A Comprehensive Survey on ECG Signals as New Biometric Modality for Human Authentication: Recent Advances and Future Challenges

Electrocardiogram (ECG) signals have highly discriminative characteristics and have recently received significant interest as a promising biometric trait. However, ECG signals are susceptible to several types of noise, such as baseline wander, powerline interference, and high/low-frequency noise, which makes it challenging to realize precise and robust biometric identification systems. Therefore, ECG signal denoising is a major preprocessing step and plays a crucial role in ECG-based biometric human identification. ECG signal analysis for biometric recognition combines several steps, such as preprocessing, feature extraction, feature selection, feature transformation, and classification, which makes it a very challenging task. Moreover, the employed success measures and the appropriate constitution of the ECG signal database also play significant roles in biometric system analysis, considering that publicly available databases are essential for the research community to evaluate the performance of proposed algorithms. In this survey, we review most of the techniques employed for ECG as a biometric for human authentication. Firstly, we present an overview and discussion of ECG signal preprocessing, feature extraction, feature selection, and feature transformation for ECG-based biometric systems. Secondly, we present a survey of the available ECG databases to evaluate and compare the acquisition protocol, acquisition hardware, and acquisition resolution (bits) for ECG-based biometric systems. Thirdly, we present a survey of different techniques for ECG signal classification, including deep learning methods: deep supervised learning, deep semi-supervised learning, and deep unsupervised learning. Lastly, we present the state-of-the-art approaches to information fusion in multimodal biometric systems.


I. INTRODUCTION
Nowadays, biometric recognition systems are increasingly used as a key form of user authentication in different fields and applications such as smartphones, banks, websites, and airports, as a substitute for conventional authentication techniques (i.e., keys and personal identification numbers (PINs)) [1]-[5], depending on the required level of security. The most commonly employed biometric traits, such as fingerprints, palmprint, iris, voice, and face, require enrolment in a database for feature recognition purposes, as shown in Fig. 1. These different biometric modalities¹ have extensively been combined in several devices and systems to produce a single multimodal biometric system that improves recognition accuracy and deters spoofing [6]-[8]. Researchers have already started to address the problem of spoofing of traditional biometric systems through electrocardiogram (ECG) biometrics.
Recently, the ECG has attracted considerable attention as a source of security and privacy in the form of a biometric, and researchers have begun investigating its use as an emerging biometric modality to identify and authenticate individuals [10]-[12]. Perhaps the key benefit of the ECG-based biometric system is the inherent liveness detection that makes it different from traditional biometrics [10]. This property will be beneficial in the fusion of ECG signals with other robust biometric modalities to provide a biometric system that is resistant to spoofing attacks [13], [14]. The work in [10] shows that ECG biometrics can achieve satisfactory identification and authentication accuracy for diverse applications, using only the QRS complex of ECG signals. Biometric recognition relying on the ECG signal dates back to the pioneering work of [21], where 12 channels of ECG signal were employed. Research investigations in [22]-[24] have shown that the ECG trace exhibits unique or discriminatory patterns among individuals. ECG-based biometric authentication systems can likewise be categorized according to the classifier employed to perform the task of recognition [5], such as k-nearest neighbor (kNN), linear discriminant analysis (LDA), neural networks, generative models, support vector machines (SVMs), and match score classifiers. Notably, all classifier-based recognition methods rely on a feature extraction step [1], [25], [26], which examines the raw ECG signal to extract significant features that serve as input to the classifier.
¹ A biometric modality refers to a system built to identify a particular biometric trait [3], [5]. Notably, a biometric modality combines a biometric trait, sensor type, and algorithms for extracting and processing the digital representations of the trait.
However, transforming the raw ECG signal into a proper feature vector for classification has to be performed carefully and requires significant expertise [5]. Deep learning, a typical approach for representation learning, overcomes this challenge by performing feature extraction automatically, adopting multiple layers to represent abstractions of the raw ECG signal [1], [15]-[20]. This permits researchers to extract discriminative features from the raw ECG signal without expert knowledge. Deep learning algorithms use a layered structure of data representation, where the lower layers extract low-level features while the upper layers extract high-level features [19], [20].
After feature extraction, the derived feature sets are stored in the database or sent to the classifier for recognition. Generally, the ECG biometrics process combines the following main components: signal preprocessing and QRS detection, feature extraction, feature selection, feature transformation, and classification [14], [27]. Several methods for ECG-based biometric systems for human authentication have been introduced to capture valuable information from the ECG signal [14], [25], [28]. These include fiducial methods, non-fiducial methods, and partially fiducial methods [23], [25], [29]-[33]. The fiducial-based technique depends on accurate localization of the reference points in each heartbeat trace, such as the P wave, QRS complex, and T wave, and employs the intervals, amplitudes, angles, and areas of these points as the biometric features [23], [25], [29]. While several fiducial-based ECG biometric recognition methods have been introduced to satisfy biometric identification requirements, accurate localization of fiducial points remains a big challenge [25], [29], [30], [32]. In contrast, non-fiducial ECG biometric methods typically do not require the detection of fiducial points [29]; principal components (PCs) [22], [23], [34], autocorrelation coefficients [14], [25], [28], and wavelet coefficients [35] are some examples of the features they use. Partially fiducial methods combine fiducial and non-fiducial methods: they locate only the R-peaks (in the QRS complex), which are employed to segment the ECG signal into single heartbeat waveforms [36], and then extract time-domain or frequency-domain information as the features [37]-[40]. The R peaks are chosen because they are commonly known to be the highest and sharpest peaks compared to other fiducial points [38], [39]. For example, the work in [5] detects the QRS complex in ECG signals and groups every four QRS complexes into a QRS vector.
In recent years, deep neural networks (DNNs) have been widely used for ECG signal classification and feature extraction and have achieved efficient results in ECG-enabled biometric systems for human identification and authentication [37], [40]. However, the biggest challenge in these methods is that their generalization ability is limited, particularly in the matching task [18], [19], [41]-[43]. That is, these methods typically predict unreliable results for individuals who have not been initially authenticated.

A. BACKGROUND AND PRIOR WORK
The works of [2]-[5], [11], [15], [49], [50] on ECG biometrics focus on heartbeat classification in healthy and non-healthy subjects, using methods such as CNNs, autoencoders, or deep belief networks (DBNs). The authors in [32] employed autocorrelation features with a non-overlapping window to build features, while the approach in [30] applied the autocorrelation/discrete cosine transform (DCT) feature of the ECG signal without fiducial detection. It is worth noting that the non-fiducial approach does not require precise boundaries of the waveforms since it eliminates the need for fiducial point localization [31], and it typically employs statistical features, autocorrelation features [32], or wavelet features [35]. Moreover, the ECG subspace-based method has also attracted much attention [7], [28]. The basic concept of sparse representation in ECG biometrics is to approximate the original ECG signal employing only a few columns of a dictionary. Sparse representation has achieved good performance in the fields of signal processing [51]-[53], computer vision, hybrid precoding [54]-[58], face recognition [59]-[61], and pattern recognition [62]. For standard discriminative sparse representation, the learning-based approach (like K-SVD [63]) and the analytic approach (like wavelets [65]) are the two most popular dictionary construction approaches. There have been several efforts to study diverse deep learning methodologies for ECG biometrics [68], [69]. Such methods have shown much better identification accuracy compared to the traditional approaches [11], [70], [71].
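To make the non-fiducial autocorrelation/DCT idea mentioned above more concrete, the following is a minimal sketch and not the implementation of [30]. It assumes a windowed signal is already available; the function names (`autocorrelation`, `dct2`, `ac_dct_features`), the number of lags, and the number of retained coefficients are hypothetical choices made for illustration, and only the Python standard library is used.

```python
import math


def autocorrelation(x, max_lag):
    """Normalized autocorrelation of x for lags 0..max_lag (mean removed)."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    energy = sum(v * v for v in centered)
    if energy == 0:
        raise ValueError("signal is constant; autocorrelation is undefined")
    return [
        sum(centered[i] * centered[i + lag] for i in range(n - lag)) / energy
        for lag in range(max_lag + 1)
    ]


def dct2(x):
    """Unnormalized DCT-II of a sequence."""
    n = len(x)
    return [
        sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        for k in range(n)
    ]


def ac_dct_features(window, max_lag, n_coeffs):
    """Autocorrelate a window, then keep the first DCT coefficients.

    No fiducial point is located anywhere in this pipeline.
    max_lag and n_coeffs are illustrative assumptions.
    """
    ac = autocorrelation(window, max_lag)
    return dct2(ac)[:n_coeffs]


# Synthetic quasi-periodic window standing in for a 5 s ECG excerpt (assumed data).
fs = 100
window = [math.sin(2 * math.pi * 1.2 * t / fs) for t in range(5 * fs)]
features = ac_dct_features(window, max_lag=50, n_coeffs=10)
```

Because the autocorrelation is computed over the whole window, the resulting feature vector does not depend on where individual heartbeats start, which is the property that makes such features attractive when fiducial detection is unreliable.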
Many survey papers discussing diverse aspects of ECG-based biometric recognition have been presented [12], [25], [72]-[78]. Since accurate ECG classification is a challenging problem, the authors in [78] presented a survey of preprocessing techniques, ECG databases, feature extraction techniques, classifiers, and performance measures. However, topics regarding feature selection and feature transformation for ECG classification problems are still missing. The authors in [72] presented a survey on the evolution and current challenges of ECG-based biometric systems, discussing topics such as inter-subject and intra-subject variability, acquisition, and ECG databases for biometric systems. However, topics regarding feature selection, feature transformation, and classification of ECG-based biometric systems were not addressed. In [73], the authors presented a survey on ECG analysis and discussed topics such as preprocessing, feature extraction, feature selection, feature transformation, classification, ECG databases, application fields, and success measures. However, topics concerning deep learning techniques such as CNN, multilayer perceptron (MLP), DBN, LSTM, Bidirectional Recurrent Neural Network (BRNN), and GRU are still missing. The work in [74] presented a thorough and systematic survey on ECG databases for biometric systems and discussed topics such as ECG variability sources, ECG databases, database categorization, acquisition hardware, and acquisition protocols. However, topics regarding feature selection, feature transformation, classification of ECG-based biometric systems, and deep learning techniques were not addressed. In [25], the authors presented a review of ECG-based biometric systems and discussed topics related to the fiducial and non-fiducial techniques of feature extraction.
However, topics regarding feature transformation, classification of ECG-based biometric systems, databases, and deep learning techniques were not addressed. Based on physiological and behavioral modalities, the authors in [75] presented a comprehensive survey on biometric recognition systems, but only discussed some of the major findings and results of the ECG-based biometric system. While the survey in [12] mainly focuses on ECG variability sources and ECG databases for biometric systems, topics regarding feature selection, feature transformation, classification of ECG-based biometric systems, and deep learning techniques are still missing. The authors in [76] presented a review of ECG processing techniques from a pattern recognition viewpoint and discussed topics such as ECG variability, feature extraction/selection/transformation, ECG preprocessing, and feature detection. However, topics regarding classifiers and deep learning techniques are still missing. In [77], a review of ECG-based authentication systems is presented, and the topics discussed include the existing ECG benchmarks, fiducial and non-fiducial features and authentication methods, and data mining classification techniques. However, topics regarding ECG feature transformation, classification, and deep learning techniques are still missing. The survey in [17] summarizes research that applied deep learning models to ECG data and discussed topics such as biometric human identification and deep learning. However, topics regarding feature selection, feature transformation, and classification of ECG-based biometric systems are still missing. Moreover, researchers interested in implementing ECG-based biometric authentication systems can obtain various public databases, including off-the-person and on-the-person databases with healthy and non-healthy ECG signals.
VOLUME 9, 2021
However, several databases devised for ECG-based biometric authentication lack standardized hardware and measurement protocol of signal features, which is a challenging problem for ECG-based biometric systems. For example, this challenge restricts robust ECG-based biometric authentication system validation, which does not support the development of expert systems robust to the variability furnished by diverse use cases.

B. PAPER CONTRIBUTION
In this survey, we present a comprehensive study of the existing literature on ECG-based biometric recognition systems for user authentication. To the best of the authors' knowledge, a comprehensive survey that addresses ECG biometric systems in terms of preprocessing, feature extraction, feature selection, feature transformation, deep learning for ECG classification, and databases is still missing. This survey paper aims to fill this gap by presenting a comprehensive study of several ECG-based biometric recognition topics to allow interested readers to reach the sought knowledge rapidly. Moreover, only a few studies have surveyed the utilization of deep learning methods for ECG-based biometric systems, where deep learning-based ECG uses algorithms such as backpropagation, CNN, recurrent neural network (RNN), LSTM, generative adversarial network (GAN), restricted Boltzmann machine (RBM), and DBN to extract features from samples of users during the training and classification phase.
Although a few review papers on ECG analysis appear in the literature, they are limited to only a few topics rather than all of them, as shown in Table 1. Therefore, this work presents a comprehensive survey on all aspects of ECG signal analysis to motivate and provide a guide for future research in using ECG signals as a biometric trait for human identification and authentication. The various contributions of this paper are summarized as follows:
• We first present a survey on the ECG signal by conducting a deep overview and discussion of the various steps of ECG analysis, such as preprocessing, feature extraction, feature selection, and feature transformation for ECG-based biometric systems.
• Secondly, we fill the gap in biometric research by presenting a comprehensive and well-organized survey of the available ECG databases. The aim is to evaluate and compare the acquisition protocol (i.e., number of subjects, gender distribution, location of electrodes), acquisition hardware (i.e., number of ECG channels, types of electrodes, number of contact points, and acquisition frequency), and resolution (bits) of ECG acquisition employed to identify the fundamental requirements and useful guidelines in ECG-based biometric systems.
• Thirdly, since deep learning is considered a promising method for analyzing and classifying ECG data for biometric recognition, we present an in-depth review on ECG signal classification with deep learning methods, under the categories of deep supervised learning (DSL), deep semi-supervised learning (DSSL), and deep unsupervised learning (DUL), which can be used for analyzing and classifying ECG data for biometric systems. Specifically, we summarize existing deep learning studies employing ECG data from various aspects and emphasize existing challenges to identify potential future research directions.
• Lastly, since multimodal biometric systems typically overcome several limitations and provide greater accuracy and more flexibility than unimodal biometric systems, we present several information fusion methods for biometrics and conclude with directions for future research on multimodal biometric systems using ECG.
To conduct a comprehensive review, we specifically review articles on ECG-based biometric systems published in journals indexed by prestigious scientific indices such as Science Citation Index and Science Citation Index-Expanded, mostly from the last two decades.
To avoid missing articles published in journals that do not explicitly state these keywords in their titles, we expanded our search to incorporate all fields per article. Notably, many unrelated articles stated some of the keywords in their introduction sections or related work sections, which gave rise to a large initial set of articles.

C. PAPER ORGANIZATION
This paper is organized as follows: In Section II, we provide an overview of ECG basics. Section III describes ECG-based biometric recognition, highlighting the basic biometric system characteristics. Section IV describes the selected ECG signal databases introduced by different authors, highlighting the main characteristics of both on-the-person and off-the-person acquisition of subjects. Section V details the framework for the analysis of ECG signals for human identification. In Sections VI and VII, the ECG signal application domains and the evaluation metrics employed for ECG-based biometric recognition evaluation are given, respectively.

II. ECG BASICS
The ECG records the various electrical potentials generated by the heart on the surface of the body [79], typically acquired via electrodes placed on the skin. These electrodes capture voltage changes caused by the depolarization and repolarization of cardiac cells, which stimulate contraction and relaxation of the cardiac muscle [80], [81]. The recent increase in the availability of low-cost portable ECG sensors has opened doors to new areas such as fitness monitoring [82] and wearable biometric authentication devices [83]-[86], resulting in the pervasive acquisition of ECG data. While the interested reader can consult [87]-[90] for a detailed description of the physiological principles of cardiac electrophysiology, we now describe the (single-lead) ECG for one cardiac cycle illustrated in Fig. 2. Specifically, one cardiac cycle in a typical ECG waveform consists of three waves, namely P, QRS (a wave complex), and T, that map to specific heart events [93].

A. P-WAVE
The P wave describes the depolarization of the left and right atria, as shown in Fig. 3, and likewise relates to atrial contraction. Its duration ranges from 0.06 to 0.12 seconds [89], [94].

B. QRS-COMPLEX
The QRS complex, as the name suggests, comprises the Q, R, and S waves [74], [94] and represents ventricular depolarization. The Q wave is a short (<0.03 seconds in duration), downward deflection, i.e., an initially negative deflection of the QRS complex, which corresponds to the depolarization of the septum. The R wave reflects depolarization of the left ventricle (apex). The S wave is short, downward, and negative, and corresponds to the depolarization of the basal and rear regions of the left ventricle. The duration of the QRS complex ranges from 0.06 to 0.09 seconds. The ventricular rate can be determined by calculating the time duration between QRS complexes.
FIGURE 2. A representation of a typical cardiac cycle with the related waves of a single-lead ECG signal [91], [92].
FIGURE 3. The heart and its conductive tissues [91], [92].
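The relation between the spacing of QRS complexes and the ventricular rate can be illustrated with a short calculation. This is a minimal sketch under stated assumptions: the R-peak times are taken as already detected, the function name `ventricular_rate_bpm` is hypothetical, and the example times are made up for illustration.

```python
def ventricular_rate_bpm(r_peak_times_s):
    """Mean ventricular rate (beats per minute) from R-peak times in seconds."""
    if len(r_peak_times_s) < 2:
        raise ValueError("need at least two R peaks")
    # RR intervals: time between consecutive QRS complexes.
    rr = [b - a for a, b in zip(r_peak_times_s, r_peak_times_s[1:])]
    if any(d <= 0 for d in rr):
        raise ValueError("R-peak times must be strictly increasing")
    mean_rr = sum(rr) / len(rr)
    return 60.0 / mean_rr


# Illustrative (made-up) R-peak times spaced 0.8 s apart.
r_peaks = [0.40, 1.20, 2.00, 2.80]
rate = ventricular_rate_bpm(r_peaks)  # an RR interval of 0.8 s gives about 75 bpm
```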

C. PQ-SEGMENT
The PQ segment depicts the delay in transmission of the impulse at the auriculoventricular node and it is usually zero potential [36], [95].

D. PQ-INTERVAL
The PQ interval depicts the time required to depolarize the auricular musculature plus the delay time encountered at the auriculoventricular node until the beginning of depolarization of the ventricles [36], [95]. Specifically, the PQ interval extends from the onset of the P wave to the onset of the QRS complex. It is worth mentioning that the repolarization wave of the auricles (termed the auricular T wave) is typically low in amplitude and not normally observed. It usually falls within the PQ segment and the QRS complex. The PQ interval, therefore, is the outcome of the electrical activity of the auricles.

E. ST-SEGMENT
The ST segment depicts the duration of the depolarized phase of the ventricles. It is the time between the completion of depolarization and the beginning of repolarization, while the chemical processes of depolarization are attempting to reverse themselves [95]. The ST segment is usually at zero potential. However, it may be elevated above or depressed below the zero-potential baseline.

F. T-WAVE
The T wave represents ventricular repolarization, with a duration ranging between 0.1 s and 0.25 s.

G. ST-INTERVAL
The ST interval is the time from the completion of ventricular depolarization to the completion of their repolarization [95].

H. QT-INTERVAL
The QT interval depicts the whole time required for depolarization and repolarization of the ventricles. That is, QT-interval starts from the onset of the QRS-complex to the end of the T-wave.
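The segment and interval definitions in this section translate directly into differences between fiducial time points, which is how fiducial methods derive interval features. The sketch below is illustrative only: the `BeatFiducials` container, the `beat_intervals` function, and the example timings are hypothetical, and the PQ and ST segments are assumed to follow the usual convention (end of the P wave to QRS onset, and end of the QRS complex to T-wave onset).

```python
from dataclasses import dataclass


@dataclass
class BeatFiducials:
    """Fiducial time points of one heartbeat, in seconds (hypothetical container)."""
    p_onset: float
    p_end: float
    qrs_onset: float
    qrs_end: float
    t_onset: float
    t_end: float


def beat_intervals(f):
    """Durations of the waves, segments, and intervals of one cardiac cycle."""
    return {
        "p_duration": f.p_end - f.p_onset,
        "pq_segment": f.qrs_onset - f.p_end,     # end of P to onset of QRS (assumed convention)
        "pq_interval": f.qrs_onset - f.p_onset,  # onset of P to onset of QRS
        "qrs_duration": f.qrs_end - f.qrs_onset,
        "st_segment": f.t_onset - f.qrs_end,     # end of QRS to onset of T (assumed convention)
        "st_interval": f.t_end - f.qrs_end,      # end of depolarization to end of repolarization
        "qt_interval": f.t_end - f.qrs_onset,    # onset of QRS to end of T
    }


# Made-up timings chosen to fall inside the duration ranges quoted in this section.
beat = BeatFiducials(p_onset=0.00, p_end=0.10, qrs_onset=0.16,
                     qrs_end=0.24, t_onset=0.34, t_end=0.54)
intervals = beat_intervals(beat)
```

A vector built from such durations, possibly complemented by amplitudes, is one simple form of the fiducial feature sets discussed in the introduction.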

III. ECG SIGNAL IN BIOMETRICS
In [34], an extensive set of ECG descriptors that characterize the heartbeat trace is presented, employing 29 subjects through 7 data recording sessions and using short measurements of 120 s length [95]. The results show that the extracted features are independent of sensor location and invariant to the state of anxiety of an individual. The work in [96] presented an ECG-based recognition system using Bayes' theorem on a database containing 502 ECG recordings. By utilizing short measurements of 10 s length, an ECG-based biometric recognition system is presented in [97], using a test set of 234 ECG recordings from 74 subjects. In [35], an ECG-based biometric recognition system is presented, employing 50 subjects through 3 data recording sessions and resulting in a classification accuracy of 95%. The authors in [22] proposed an ECG-based biometric recognition system for human identification employing 43 subjects, where the ECG recordings of subjects are acquired while performing seven 2-min tasks per session. The method in [22], namely eigenPulse, exploits principal component analysis (PCA), which is commonly used with traditional biometrics such as fingerprint and iris. In [98], a screening method for evaluating the quality of each segmented heartbeat is presented via the PCA technique eigenPulse, employing 65 subjects through 6 data recording sessions, which improved recognition accuracy. In [99], a single-lead ECG-based biometric applying a short-time frequency approach with robust feature selection is introduced from a relatively large sample of 269 subjects, with data acquired on three separate occasions over a 7-month period, which achieved a verification EER of 5.58%, rank-1 recognition accuracy of 76.9%, and rank-15 recognition accuracy of 93.5%.
In [100], a single-lead ECG-based biometric applying a short-time frequency approach with robust feature selection is introduced, which uses a test set of ECG recordings from 168 subjects through 1 data recording session, employing short measurements of 90 s length, with a rank-1 recognition accuracy of 98%. In [101], an ECG-based biometric system employing a Birge-Massart strategy is proposed, using only a subset of the wavelet coefficients to determine the signal difference/similarity measures. Here, ECG recordings from 30 subjects through 2 recording sessions are employed. The authors in [102] introduced a recognition technique based on ECG signals, which enhances the autocorrelation/LDA feature extraction algorithm by incorporating the periodicity transform (which is robust in handling heart rate changes). The performance of the system in [102] over 52 subjects, with ECG recordings through an average of 1.2 recording sessions and short measurements of 180 s length, is 92.3%. The authors in [103] introduced a set of standards for ECG-based biometric recording and the UofT ECG Database (UofTDB) for performance evaluation, using 1012 subjects through an average of 1.6 data recording sessions and short measurements of 180 s length. The authors in [104] introduced short-term and long-term public datasets, with ECG data acquired at the hand palms and fingers using dry silver/silver chloride (Ag/AgCl) electrodes and electrolycra strips. Here, they employed 128 subjects through an average of 1.5 data recording sessions, utilizing short measurements of 120 s length.
In [105], a 1D convolutional long short-term memory neural network for an ECG-based biometric system is proposed on a public Physionet database. Here, data recordings of 109 subjects were acquired using 16 channels of ECG signals and short measurements of 1 s length, which achieved an EER of 0.41% and rank-1 recognition accuracy of 99.58%. The authors in [106] introduced an ECG-based biometric system employing eigenvector centrality on a public Physionet database consisting of ECG data of 109 subjects, using 64 channels of ECG signals and short measurements of 12 s length, which achieved an EER of 4.40% and rank-1 recognition accuracy of 92.60%. The authors in [107] also proposed a network structure, namely ECG-based subject identification (ES1D). ES1D is a modification of the conventional CNN, which uses Welch's power spectral density estimate of the ECG signal acquired from the public database DREAMER with 23 subjects and achieved an accuracy of 94.01%. In [108], a deep CNN-based ECG biometric system is proposed with a data augmentation technique and assessed on a Physionet ECG Database [26] comprising 109 subjects, using 64 ECG channels each sampled at 0.16 kHz, which results in a lower EER in verification mode. The authors in [16] proposed a CNN-based biometric system, which works directly on raw ECG data, employing 100 subjects with equal-length subsequences of 12 s, resulting in a rank-1 accuracy of 97.00% using the BCIT database. The authors in [109] introduced a dynamical radial basis function (RBF) neural network-based ECG recognition system, using a FuWai ECG database consisting of 722 subjects, employing 12 ECG channels through 2 data recording sessions. Here, the signals were acquired with a 1000-Hz sampling rate and 16-bit resolution, and the proposed method achieved a recognition accuracy of 91.4%.
The authors in [5] introduced a novel ECG-based biometric recognition system relying on a deep CNN, employing two public datasets to extract training and test sets, specifically, the Intercity Digital Electrocardiogram Alliance-IDEAL (E-HOL-03-0202-003) database [110] and the Physionet (PTB Diagnostic ECG) Database [30]. Here, the ECG signals were acquired from 52 subjects through 5 data recording sessions using 12 ECG channels with a sampling frequency of 1000 Hz and divided into slots of 10 s, and the system achieved a satisfactory EER of 2.90%. Table 2 summarizes the significant findings of the ECG-based biometric recognition systems with additional information on the number of subjects, gender distribution of subjects, type of electrode used, acquisition frequency, average number of sessions, and average ECG lengths.
Generally, we can classify modalities in several ways depending on the criteria. For example, modalities can be categorized as physiological (ECG, DNA, iris, fingerprint, earlobe), cognitive, and behavioral (voice, keystroke, gait, and signature) [111]. The following are the characteristics of all of these biometric modalities [75], [112]: (a) Universality: Every person in the target population should possess the trait. (b) Uniqueness: The biometric trait should be sufficiently distinguishable across individuals. (c) Permanence: Based on the matching criterion, the biometric trait should be sufficiently invariant (i.e., stable and durable) over a long period. (d) Collectability: It should be possible to acquire and digitize the biometric trait employing suitable devices, so that it can later be used to authenticate a user. (e) Acceptability: The biometric identifier should have broad public acceptance as an authentication or identification method, and the device employed for acquisition should be safe. (f) Circumvention: Spoofing of the characteristic utilizing fraudulent schemes to defeat or bypass the biometric system should be challenging. Notably, the ECG biometric modality, compared with other biometric modalities in Table 3, has proven to be the most promising, surpassing them in most of the characteristics that describe the quality of a biometric modality [113]. Its unique nature makes it more robust to spoofing attacks than traditional biometric modalities, and the inherent liveness detection ensures that the ECG-based biometric system is not being overwhelmed [114]. Unlike anatomical biometric identities (such as fingerprints or facial features) that have a two-dimensional data representation, the ECG is a physiological low-frequency signal that has a one-dimensional data representation. Hence, the ECG becomes a computationally more efficient alternative to video- or image-based biometric systems, particularly for continuous recognition systems [72], which depend heavily on timely decisions.

1) GENERAL SCHEME OF ECG-BASED BIOMETRIC SYSTEM
Typically, the ECG-based biometric authentication process is split into three main functionalities: (a) Enrolment: It forms the initial process of acquiring ECG biometric data samples from a person and then creating a reference template depicting the user's identity, which is used for later comparison [115]. (b) Verification: It renders a matching score between the biometric sample furnished by the user and his/her template [115]. The matching score is defined between 0% and 100% (with 100% being impossible to achieve).
(c) Identification/Recognition: It involves determining the identity of an unknown subject from a database of individuals [116]. Notably, the biometric system can then either attribute to the unknown subject the identity corresponding to the best-matching characterization found in the database (or a list of the best-matching profiles) or reject the subject. This work surveys scientifically relevant papers on ECG-based biometric recognition systems for individual identification and authentication.
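The three functionalities above can be sketched with a simple template-matching scheme. This is a minimal illustration, not a method taken from the surveyed works: the function names, the use of averaged feature vectors as templates, cosine similarity as the matching score, the threshold value, and the subject labels are all assumptions made for the example.

```python
import math


def cosine_similarity(a, b):
    """Matching score between two feature vectors (stand-in for a real matcher)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError("zero-length feature vector")
    return dot / norm


def enroll(samples):
    """Enrolment: average several feature vectors into one reference template."""
    n = len(samples)
    return [sum(column) / n for column in zip(*samples)]


def verify(probe, template, threshold):
    """Verification: accept the claimed identity if the score reaches the threshold."""
    return cosine_similarity(probe, template) >= threshold


def identify(probe, gallery, threshold):
    """Identification: return the best-matching enrolled identity, or None to reject."""
    best_id, best_score = None, -1.0
    for subject_id, template in gallery.items():
        score = cosine_similarity(probe, template)
        if score > best_score:
            best_id, best_score = subject_id, score
    return best_id if best_score >= threshold else None


# Hypothetical subjects and made-up three-dimensional feature vectors.
gallery = {
    "subject_a": enroll([[1.0, 0.0, 0.2], [0.9, 0.1, 0.2]]),
    "subject_b": enroll([[0.0, 1.0, 0.5], [0.1, 0.9, 0.5]]),
}
probe = [0.95, 0.05, 0.2]
accepted = verify(probe, gallery["subject_a"], threshold=0.95)
who = identify(probe, gallery, threshold=0.95)
unknown = identify([-1.0, -1.0, -1.0], gallery, threshold=0.95)
```

The threshold controls the trade-off between false acceptances and false rejections; in practice it would be tuned on enrolment data rather than fixed by hand as in this sketch.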

IV. ECG SIGNAL DATABASES
ECG-based biometrics show promising recognition rates employing both short-term and long-term data. This increasing attention to ECG-based biometrics has prompted the acquisition of ECG signals. However, most existing studies on ECG-based biometric systems for user authentication do not evaluate their design on large datasets, compared to other biometric modalities. In the following, we present some of the ECG databases that provide single-lead and multi-lead ECG signals for both on-the-person and off-the-person acquisition of subjects.

A. ON-THE-PERSON ACQUISITION
The on-the-person acquisition scheme pertains to devices that must be fastened to the body of the subject, usually requiring conductive gel. This category encompasses devices that range from t-shirts to other wearable form factors. In the following, the first nine public databases are available for ECG-based biometric identification and authentication.

1) ODINAKA ET AL. [99] ECG SIGNAL DATABASE
This single-lead ECG signal database involves 269 subjects (145 females and 124 males) sampled from the general population on three separate occasions over a seven-month period. The authors employed the standard wet silver/silver chloride (Ag/AgCl) electrodes for measuring the ECG signals. Specifically, the authors use a simple recording montage, with electrodes placed bilaterally on the lower rib cage of the subject so that subjects are not required to undress. The ECG signals, originally sampled at 10 kHz, were subsequently down-sampled to 1 kHz and digitally notch filtered at 60 Hz to eliminate power line interference [99]. In the preprocessing stage, the authors reduced the resulting ECG signal to single 700 ms segments aligned to the respective R-wave peak, starting 200 ms prior to the peak. The segment (heart pulse) duration was chosen to guarantee that all of the major P, Q, R, S, and T-wave components were included while reducing the likelihood of including segments of adjacent beats. Each ECG signal is then normalized by subtracting the mean and dividing by the standard deviation [99]. To align the ECG segments of each study participant to the peak of the R-wave, the peak has to be detected: the signal was first high-pass filtered using an infinite impulse response (IIR) elliptical filter (5 Hz cutoff, 90 dB attenuation), and then an artifact-free calibration epoch of 15 s was selected [99]. Positive inflection points (peaks of at least 75% of the maximum detected peak amplitude) within this epoch were then determined for further analysis.
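The segmentation and normalization described above can be sketched as follows. This is a minimal illustration and not the implementation of [99]: the R-peak indices are assumed to be already available, normalization is assumed to be applied per segment, and beats too close to the record boundaries are simply skipped.

```python
import numpy as np

def segment_beats(ecg, r_peaks, fs=1000, pre_ms=200, total_ms=700):
    """Cut fixed-length beats around R peaks and z-score each one.

    fs, pre_ms and total_ms follow the values quoted in the text
    (1 kHz, 200 ms before the peak, 700 ms in total).
    """
    pre = int(round(pre_ms * fs / 1000))
    length = int(round(total_ms * fs / 1000))
    beats = []
    for r in r_peaks:
        start, stop = r - pre, r - pre + length
        if start < 0 or stop > len(ecg):
            continue  # incomplete beat at the record boundary
        seg = np.asarray(ecg[start:stop], dtype=float)
        std = seg.std()
        if std == 0:
            continue  # flat segment, cannot be normalized
        beats.append((seg - seg.mean()) / std)
    return np.array(beats)
```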
2) WÜBBELER ET AL. [97] ECG SIGNAL DATABASE
The work in [97] collects 234 ECG recordings from 74 subjects (Caucasian subjects, 40 male and 34 female, aged 19 to 86 years) following a realistic framework for ECG biometrics by applying short measurements of 10 s length in combination with a practicable choice of ECG leads. The authors employed the standard wet Ag/AgCl electrodes for measuring the ECG signals. The sampling rate of the data employed in this investigation was 500 Hz with a 12-bit resolution. For the extraction of biometric features, standard signal processing methods, which were not optimized, were used for preprocessing. ECG traces of 10 s length from three recorded channels were baseline corrected by subtracting a moving median of 1 s width, and a low-pass filter with a cut-off frequency of 75 Hz was applied to each channel. For beat detection, the R-peak positions were determined by applying a threshold procedure to the absolute value of the temporal derivative of each low-pass filtered ECG trace.
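A rough sketch of this preprocessing chain is given below. The filter order, the zero-phase filtering, the relative threshold, the refractory period, and the function names are assumptions made for illustration; [97] is only reported here as using a 1 s moving median, a 75 Hz low-pass filter, and a threshold on the absolute derivative.

```python
import numpy as np
from scipy.signal import butter, filtfilt, medfilt

def preprocess_trace(ecg, fs=500, cutoff_hz=75.0):
    """Subtract a ~1 s moving median, then low-pass filter the trace."""
    ecg = np.asarray(ecg, dtype=float)
    kernel = int(fs) | 1  # medfilt needs an odd kernel (about 1 s wide)
    baseline = medfilt(ecg, kernel_size=kernel)
    b, a = butter(4, cutoff_hz, btype="low", fs=fs)  # order 4 is assumed
    return filtfilt(b, a, ecg - baseline)

def detect_r_peaks(filtered, fs=500, rel_threshold=0.5, refractory_s=0.25):
    """Threshold the absolute derivative, then refine to the local maximum."""
    slope = np.abs(np.diff(filtered))
    candidates = np.flatnonzero(slope > rel_threshold * slope.max())
    half = int(0.05 * fs)            # +/- 50 ms search window
    refractory = int(refractory_s * fs)
    peaks = []
    for idx in candidates:
        if peaks and idx - peaks[-1] < refractory:
            continue  # still inside the previously detected beat
        lo, hi = max(idx - half, 0), min(idx + half, len(filtered))
        peaks.append(lo + int(np.argmax(filtered[lo:hi])))
    return np.array(peaks)
```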

3) SHEN ET AL. [100] ECG SIGNAL DATABASE
This database collects short-term, resting, Lead-I ECG signals from 168 subjects (113 females and 55 males) aged between 19 and 52 years, employing the standard wet Ag/AgCl electrodes. For the enrolment process, the ECG was recorded for 90 s at a sampling rate of 0.5 kHz with a resolution of 12 bits. A first-derivative-based detection technique, digital filtering, a zero-crossing technique, and the Pan-Tompkins algorithm were employed on raw ECG signals to detect PQRST fiducial points [100]. For R point detection, the ECG bandwidth was limited to 0.01 to 50 Hz and a reliable, real-time QRS detection algorithm was applied. Specifically, the Pan-Tompkins algorithm was adopted to determine all the R points to calculate the R-R intervals [100].

4) ISRAEL ET AL. [34] ECG SIGNAL DATABASE
This database collects 29 ECG recordings from 17 males and 12 females aged 22 to 48 years, with 12 repeat sessions, for a total of 41 sessions within the data set [34]. The data were acquired using an on-the-person approach (standard wet Ag/AgCl electrodes) at a high temporal resolution of 1000 Hz [34]. The frequency power spectrum, which contains a combination of noise sources and subject information, was filtered to remove the 0.06 Hz and 60 Hz noise while preserving the subject heartbeat information between 1.10 and 40 Hz, using a bandpass filter between 2 and 40 Hz [34]. Notably, the authors formulated the bandpass filter employing the equivalent of a lower order polynomial, which works by allowing 'advantageous' bleeding of information into the processed data stream.

5) IRVINE ET AL. [22] ECG SIGNAL DATABASE
This database collects ECG signals from 43 subjects, 26 male and 17 female, ranging in age between 18 and 48. During each session, the subject's ECG was recorded while performing seven 2-min tasks [22]. To determine session-to-session variations, many subjects were recorded in repeated sessions, using data from the subject's baseline state, which is regarded as a low-stress task. The authors performed attribute reduction by normalizing each ECG signal to fixed lengths of 250, 100, 50, 25, and 10 samples from the original 12-bit, 1000 Hz data. Subsequently, the raw ECG signal in [22] was Fourier bandpass filtered using the technique in [118] to eliminate electrical, thermal, and A/D noise sources. Next, the individual heartbeats were aligned by the peak of their R wave by computing the autocorrelation function of the ECG data stream and using that function to segment the heartbeats.

6) ZHANG AND WEI [96] ECG SIGNAL DATABASE
This database collects ECG recordings from 502 subjects using the on-the-person approach with the standard wet Ag/AgCl electrodes. Specifically, the authors acquired the ECG recordings (10 s long) with a 12-lead ECG device at a sampling rate of 500 Hz. The PCA methodology was applied to reduce the dimension of the feature variables [96], while the classification method was based on Bayes' theorem. Moreover, each ECG recording was divided into two segments: the first is used to build the recognition model, and the other is used as the test segment for identification. Furthermore, the difference method was applied to the detection of the R-peak and the QRS complex.

7) AGRAFIOTI AND HATZINAKOS [102] ECG SIGNAL DATABASE
The authors in [102] collect ECG recordings from the wrists of 52 healthy subjects ranged in age between 21 and 40 using (On-the-person approach) the Vernier ECG Sensor. Specifically, the acquired ECG signals consist of 3 min single-channel recordings, where the acquisition frequency was set to 0.26 kHz. However, authors in [102] did not specify the gender distribution of subjects. The authors in [102] preprocessed the ECG traces using a Butterworth bandpass filter, whose cutoff frequencies were set at 0.5 Hz and 40 Hz, to subdue the effect of the high-frequency components (powerline interference) and the low-frequency components (baseline wander). The authors developed a novel recognition method based on ECG signals, which enhances the autocorrelation (AC)/LDA feature extraction algorithm, by incorporating the periodicity transform (PT).

8) JANG ET AL. [98] ECG SIGNAL DATABASE
The authors in [98] collect ECG recordings from 65 subjects, 34 males and 31 females ranged in age between 22 to 48 years old using a single channel with a frequency of 1 kHz and a resolution of 12 bit. The authors used an adaptive filter on the raw ECG signal to eliminate electrical, thermal, and A/D noise sources before subsequently segmenting the ECG signal into individual beats and resampling to 100 Hz. Here, the amplitude of each heartbeat was normalized into a range of 0 and 1. Lastly, low-quality signals were discarded by the authors after a quality screening. For evaluating the quality of each segmented heartbeat, the authors designed a screening method, which improves the quality of the extracted signal for the identification task.
For the sake of completeness, we also discuss in the following the PhysioNet repository public databases, which were originally acquired for clinical experiments rather than for biometric purposes.

9) MWM-HIT [119] ECG SIGNAL DATABASE
The MWM-HIT contains ECG records from 100 subjects. The length of each recording is 10 s. This database employs five different conditions to record the subject during the session i.e., sitting, standing, supine, exercise sitting, and exercise standing. Carewell ECG Workstation (PCECG-500) is the acquisition device used for the recording, where the acquisition frequency was set to 1 kHz. Four electrodes relating to the Left hand, Right hand, Left leg, and Right leg, are employed to capture the ECG from the body of the subject. The database collects 500 ECG recordings from the 100 subjects, i.e., 5 records per subject.

B. OFF-THE-PERSON ACQUISITION
The off-the-person acquisition scheme pertains to devices integrated into objects or surfaces the subjects interact with (e.g., a computer keyboard or a game station remote) rather than being attached to the body of the person. A significant benefit of the off-the-person method is that the sensor placement does not require a voluntary user action, in contrast to wearable on-the-person devices. These novel methods are well aligned with the future trends envisioned in terms of biometric authentication.

1) CHECK YOUR BIOSIGNALS HERE INITIATIVE (CYBHi) [104] ECG DATABASE
The CYBHi database in [104], which acquires 128 ECG recordings (2 min long) utilizing dry Ag/AgCl electrodes and electrolycra strips under the off-the-person strategy, is an extension of the database in [120]. Specifically, the ECG data were collected at the hand palms and fingers of the subjects. The protocol of the authors combines both neutral tasks and emotional elicitation tasks [104], which were introduced as a method of influencing intra-subject variability, to improve the accuracy of ECG-based biometric identification. In addition, the authors concurrently acquired electrodermal activity (EDA) data as a way of providing ground truth information regarding the arousal state of the subject, which can be utilized in correlation with the ECG data.

2) CHAN ET AL. [35] ECG DATABASE
This database collects ECG data from 50 subjects, 45 males and 5 females ranging in age between 18 and 40, during 3 data recording sessions on different days, where subjects held two electrodes on their thumb and index fingers. Specifically, the ECG data were recorded employing a pair of half-inch wet Ag/AgCl electrodes held on the pads of the subject's thumbs utilizing their index fingers. The authors differentially amplified the acquired ECG signal using a high gain AC amplifier, such that the subject's right thumb signal was connected to the positive amplifier input, while the subject's left thumb signal was connected to the negative amplifier input and common ground reference [35]. Here, the amplifier variable gain and bandwidth were set at 2000 and 1 Hz to 100 Hz, respectively. The notch filter available in the amplifier system was used to reduce power-line interference. The acquired ECG signal was sampled at 1 kHz with a resolution of 12 bits. Each subject participated in three data recording sessions, with a 90 s ECG data sequence recorded during each session. For each data sequence, PQRST complexes were detected using the multiplication of backward differences algorithm and then temporally aligned using a cross-correlation measurement.
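The temporal alignment by cross-correlation mentioned above can be illustrated with the following sketch. It is not the procedure of [35]: the template, the search range `max_shift`, and the use of a circular shift (`np.roll`) are assumptions chosen to keep the example short.

```python
import numpy as np

def align_to_template(beat, template, max_shift=20):
    """Shift `beat` so that its cross-correlation with `template` is maximal.

    Returns the aligned beat and the applied shift in samples
    (negative means the beat was moved earlier).
    """
    beat = np.asarray(beat, dtype=float)
    template = np.asarray(template, dtype=float)
    best_shift, best_score = 0, -np.inf
    for shift in range(-max_shift, max_shift + 1):
        # Circular shift: acceptable when the beat is padded by baseline.
        score = float(np.dot(np.roll(beat, shift), template))
        if score > best_score:
            best_shift, best_score = shift, score
    return np.roll(beat, best_shift), best_shift
```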
3) UNIVERSITY OF TORONTO (UofT) ECG DATABASE (i.e., UofTDB) [103]
The authors in [103] collect ECG signal recordings from 1012 subjects, 398 males and 622 females ranging in age between 18 and 52 years, to evaluate the performance of various ECG biometric methods. The authors used dry Ag/AgCl electrodes, placing the left thumb on the positive electrode, the right thumb on the negative electrode, and the right forefinger on the reference electrode, with an acquisition frequency of 0.2 kHz and a resolution of 12 bits. The length of each recording ranged from 2 min to 5 min. The authors employed a fourth-order Butterworth bandpass filter (with cutoff frequencies of 0.5 Hz and 40 Hz) to eliminate baseline wander and power line interference in the raw ECG signal. Moreover, the QRS complex detection was performed using the Pan and Tompkins methodology in [103], while each heartbeat was pruned to a length of 700 ms, with 200 ms before the R peak. Table 4 summarizes the databases found in the ECG biometrics literature.
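A Butterworth bandpass stage of the kind used for UofTDB can be sketched with SciPy as below. Whether [103] filtered causally or with zero phase is not stated here, so the forward-backward `sosfiltfilt` call is an assumption; note also that passing `order=4` to `butter` for a bandpass design yields an eighth-order transfer function, so the mapping to the "fourth-order" filter in the text is approximate.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def bandpass_ecg(ecg, fs=200.0, low_hz=0.5, high_hz=40.0, order=4):
    """Zero-phase Butterworth bandpass (0.5-40 Hz by default)."""
    sos = butter(order, [low_hz, high_hz], btype="bandpass",
                 fs=fs, output="sos")  # second-order sections for stability
    return sosfiltfilt(sos, np.asarray(ecg, dtype=float))
```

With the default settings, slow drift well below 0.5 Hz and mains interference at 50 Hz or 60 Hz are both strongly attenuated, while components around 10 Hz pass almost unchanged.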

V. ANALYSIS OF ECG SIGNALS FOR HUMAN IDENTIFICATION
Notably, ECG classification combines steps namely preprocessing, feature extraction, feature selection, feature transformation, and classification, as shown in Fig. 4.

A. ECG SIGNAL PREPROCESSING
Generally, ECG signal recordings are contaminated by diverse noise sources and artifacts, which results in obtaining inaccurate R peaks. The noise sources and artifacts are characterized as interfering signals that emanate from anything that does not belong to the electrical activity generated by the heart. The presence of noise and artifacts may affect or even compromise the identification of a representative ECG signal. In the preprocessing step, the objective is to remove as much of this noise and these artifacts from the ECG signal as possible in order to identify the following fiducial points: $P_{\text{onset}}$, $P_{\text{peak}}$, $P_{\text{offset}}$, $QRS_{\text{onset}}$, $R_{\text{peak}}$, $QRS_{\text{offset}}$, $T_{\text{peak}}$, $T_{\text{offset}}$, $U_{\text{peak}}$, $U_{\text{offset}}$. Since noise and artifacts may result in incorrect biometric identification, ECG signal preprocessing and denoising become an essential requirement [121]. Accordingly, notable attention has been devoted in recent decades to designing mathematical techniques and algorithms to extract noise-free ECG features from the noisy ECG data with an accuracy sufficient for biometric authentication. The ECG signal is first preprocessed, which improves the compression ratio (CR) and percentage root mean-square (RMS) difference (PRD) [122]. Typically, ECG signal preprocessing comprises the following steps: mean removal, amplitude normalization, QRS detection, segmentation, period normalization, and zero padding [123]. Typically, the ECG signal $x_i$ is initially preprocessed to generate the signal $y_i$ as expressed in (1) [122], where $y_i$ and $x_i$ denote the preprocessed signal and original signal, respectively, zeros(1, M) denotes a row vector of $M$ zeros, while $A_m$ and $m_x$ denote the original ECG signal maximum value and normalized signal mean, respectively.
In (1), $m_x$ is the mean of the amplitude-normalized signal. The zero-padding technique extends the length of a series of numbers by adding zeros. Computing the discrete Fourier transform (DFT) of an ECG signal after zero-padding yields a spectrum with additional interpolated values. Conversely, zero-padding a DFT above the Nyquist frequency yields, after the inverse DFT, an interpolated version of the original ECG signal [124]. While the frequency resolution remains unchanged after zero-padding, the spectral estimate is smoother and spectral peaks are easier to identify. The discrete wavelet transform (DWT) and empirical mode decomposition (EMD) are the two most widely used techniques for ECG signal denoising [125]. Combining ECG signal normalization and mean removal with a DWT denoising algorithm reduces the number of significant wavelet coefficients and keeps the largest coefficient magnitude below one [124], [125].
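The normalization and zero-padding steps can be illustrated with the short sketch below. It assumes that the signal is first divided by its peak absolute value (taken here as $A_m$), that the mean $m_x$ of the scaled signal is then removed, and that $M$ zeros are appended on each side; the exact arrangement used in [122] may differ.

```python
import numpy as np

def normalize_and_pad(x, pad):
    """Scale by the peak value, remove the mean, and zero-pad both ends."""
    x = np.asarray(x, dtype=float)
    a_m = np.max(np.abs(x))   # assumed meaning of A_m
    scaled = x / a_m
    m_x = scaled.mean()       # mean of the amplitude-normalized signal
    y = scaled - m_x
    padded = np.concatenate([np.zeros(pad), y, np.zeros(pad)])
    return padded, a_m, m_x

def zero_padded_spectrum(y, n_fft):
    """Magnitude spectrum on a denser grid; rfft pads y with zeros to n_fft."""
    return np.abs(np.fft.rfft(y, n=n_fft))
```

Zero-padding to four times the original length, for example, inserts three interpolated bins between each pair of original DFT bins without changing the values at the original bins.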

1) ECG FILTERING
A crucial part of any ECG processing algorithm is feature extraction. Feature extraction algorithms typically combine a preprocessing filter that decomposes the ECG into a signal which maximizes the signal-to-noise ratio (SNR) of the QRS complex [126]. Thus, the ECG preprocessing stage employs a filtering block to eliminate any existing artifact from the ECG signal. Typically, an ECG signal is firstly bandpass filtered with various frequency bands before interpreting it. Frequency domain analysis of heart rate variability (HRV) renders indispensable knowledge on cardiovascular control [127], which is significant in the ECG-based biometric identification. In human HRV signal, the three critical frequency regions include i) the very low-frequency band below 0.04 Hz, ii) the low-frequency band (0.04 − 0.15 Hz), and iii) the high-frequency band (0.15 − 0.5 Hz) [127], [128].
Filtering an ECG signal is achieved with the following filter types: high-pass filters, low-pass filters, notch filters, bandpass filters, median filters, Savitzky-Golay filters, and adaptive filters.
(a) High-Pass Filters: High-pass filters (low-frequency cutoff) attenuate low-frequency signals by allowing only higher frequencies to pass through unaffected. They are employed to subdue low-frequency components in the ECG signal, namely motion artifact, respiratory variation, and baseline wander. Generally, the analog high-pass filter is less distorting than its equivalent analog low-pass filter as a result of the higher operating frequencies. Nevertheless, analog high-pass filters experience severe phase shifts impacting the initial fifth to tenth harmonics of the signal [129]. That is, a high-pass filter with a cut-off frequency of 0.5 Hz can still affect frequencies up to 5 Hz. High-pass filters with cut-off frequencies of 0.5 Hz [130], [131], [138], [136] have been employed to eliminate baseline wander and to suppress baseline drift. In [137], high-pass filters of 0.05 Hz and 0.5 Hz were employed; results in ''real-time mode'' revealed 93% alterations in the ST segment of the subjects, seen only with the 0.5 Hz high-pass filter and not with the 0.05 Hz high-pass filter.
(b) Low-Pass Filters: Low-pass filters (high-frequency cutoff) on the ECG signals are typically employed to eliminate high-frequency muscle artifact, powerline interference, and other external interference [138]. Generally, low-pass filters only attenuate the higher-frequency amplitude components of the ECG signal. While the analog low-pass filter does not alter repolarization signals [129], it has a remarkable influence on the QRS complex, epsilon, and J-waves. Low-pass filters with the cut-off frequency of 15 (1 − 35) Hz [155], (5 − 35) Hz [156], (1 − 25) Hz [157], and (0.1 − 100) Hz [158], have been introduced. Notice that we consider the range of the band-pass filters as the range of undesired interference in the original ECG signal. Subsequently, the biometric system potentially uses the filtered band signal as input.
(d) Notch Filters: ECG signals are often exposed to severe power line interference, typically at 50 Hz or 60 Hz. A widely used approach to eliminate the power line interference noise is the use of a notch filter (a bandstop filter with a narrow stopband), but it comes with the risk of potentially distorting the preprocessed ECG signal. Specifically, the narrow-band notch filter with the notch centered at either 50 Hz or 60 Hz attenuates the frequency power within the respective stopband [159]. The discrete Fourier transform (DFT) filter and the Clean Line algorithm [159], [160] have been introduced as alternatives, yet they may fail to eliminate power line interference of strongly fluctuating amplitude. In [161], a notch filter is introduced to cut off the 50 Hz power line interference, where the authors obtained a mean equal error rate (EER) of 2.75% ± 0.29 in authentication and a mean identification error of 5.61% ± 0.94. Additionally, notch filters set at 50 Hz [35], [162] and 60 Hz [99], [163]- [165] have been employed to remove power line interference.
(e) Median Filters: Median filtering is a non-linear signal processing technique utilized for smoothing signals (noise suppression) [78]. The key concept of the median filter is to run through the original ECG signal entry by entry, substituting each entry with the median of neighboring entries [166]. Numerous articles and books (see, for instance, [167]- [169]) have investigated the median operation, analyzing its characteristics and introducing new and viable extensions. However, as the output of the median filter is invariably one of the input samples, certain signals could pass through the median filter unchanged [170]. The median filter technique is employed mainly in ECG-based biometric systems for baseline adjustment or correction [78].
In [171] and [172], ECG signals are filtered employing two median filters of 200 ms and 600 ms width, respectively, to extract the baseline wander, and in [173] a median filter of 600 ms width is used to suppress the T waves.
(f) Savitzky-Golay Filters: The Savitzky-Golay (SG) filters are widely applied, mainly for smoothing and differentiation. Notwithstanding their excellent characteristics, they are seldom employed in ECG signal processing. In [174], the authors applied the Savitzky-Golay filter for preprocessing of the ECG signal for R-peak detection and obtained an MSE of 0.026%, a sensitivity of 99.98%, and an accuracy of 99.97%.
(g) Adaptive Filters: Since ECG signals are non-stationary, the use of traditional filters with fixed, deterministic coefficients to preprocess the original ECG signal is not efficient [177]. Adaptive filters, in contrast, can adapt their coefficients based on the dynamic characteristics of the non-stationary ECG signals. However, adaptive filters have the drawback that they require a noise model or a desired signal model. The authors in [177] introduced an algorithm based on fixed-point convolution kernel compensation for determining a model for employing an adaptive filter; the results demonstrated improved performance in removing the noise from ECG signals. Table 5 comparatively summarizes the performances of the above-described filter types.
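Two of the filter types above, the notch filter in (d) and the cascaded 200 ms and 600 ms median filters used for baseline extraction, can be sketched with SciPy as follows. The quality factor `q`, the zero-phase filtering, and the cascade order of the two median filters are assumptions made for this illustration and are not taken from the cited works.

```python
import numpy as np
from scipy.signal import filtfilt, iirnotch, medfilt

def notch_powerline(ecg, fs, mains_hz=50.0, q=30.0):
    """Narrow IIR notch at the mains frequency, applied forward-backward."""
    b, a = iirnotch(mains_hz, q, fs=fs)
    return filtfilt(b, a, np.asarray(ecg, dtype=float))

def _odd(n):
    """Nearest odd integer at or above int(n); medfilt needs odd kernels."""
    return int(n) | 1

def remove_baseline_median(ecg, fs, first_ms=200, second_ms=600):
    """Estimate the baseline with two cascaded median filters and subtract it."""
    ecg = np.asarray(ecg, dtype=float)
    stage1 = medfilt(ecg, _odd(first_ms * fs / 1000))       # removes QRS and P
    baseline = medfilt(stage1, _odd(second_ms * fs / 1000))  # removes T waves
    return ecg - baseline, baseline
```

The median stages leave sharp QRS peaks almost untouched in the detrended output because the peaks occupy only a small fraction of each window.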

2) ARTIFACT REMOVAL, RESAMPLING, AND DIGITIZATION
Detection and reduction of noises and artifacts in the ECG signals are two of the greatest challenges for enhancing signal quality. Typically, three predominant sources of variability impact the interpretation of the morphological features contained within ECG, viz noise and artifacts, intra-subject variability, and inter-subject variability [74], [178]- [180], which must be therefore effectively removed.

3) NOISE AND ARTIFACTS
The ECG signals for biometric recognition in authentication applications may be potentially interfered with or compromised by the presence of noise and artifacts [181]. These are characterized predominately as interfering signals that emanate from anything outside the electrical activity produced by the heart. Hence, ECG enhancement is to isolate the original ECG signals from the undesired artifacts for easy interpretation. Several methods have been proposed for ECG enhancement such as independent component analysis (ICA) [182], advanced averaging [183], [184], adaptive filtering [185], SVD [186], maximally decimated perfect-reconstruction FIR filter banks [187], wavelet transform [188], [189], and non-linear filter banks [190]. Generally, one of the foremost challenges in the ECG-based biometric system is the separation of the desired signal from several types of noise such as baseline wander, power line interference, motion artifacts, muscle noise, and other interference [191], [192].
(a) Baseline Wander: Baseline wander is a slow-varying artifact [191], which essentially results from the skin-electrode impedance variation and emerges in the form of a low-frequency noise merged with the ECG signal [193]. Impedance variation can manifest as a result of the individual's breathing, the electrode-skin contact, and smooth movements [192]. Moreover, baseline wander is a typical artifact that corrupts the recorded ECG signals and stems from respiration at frequencies varying within 0.15 − 0.3 Hz, which can be filtered using a standard high-pass digital filter [194]. Typically, linear filtering and polynomial fitting are the two primary methods used for the elimination of baseline wander. Here, the linear time-invariant high-pass filter design requires determining the filter cut-off frequency (approximately $F_c = 0.5$ Hz, where $F_c$ denotes the cut-off frequency) and phase response characteristic [194], [195]. The authors in [195] proposed the application of the discrete wavelet transform (DWT) for noise and wandering suppression in ECG signals, which enables inspecting high-frequency events of short duration in non-stationary signals. Some researchers have also adopted EMD for baseline wander reduction [191], [196]- [198], [200], which requires a priori knowledge of baseline wander behavior. It is shown in [141] that baseline wander noise may deviate the amplitude of ECG signals for biometric recognition by up to 50%. Thus, the drift of the baseline can be modeled as amplitude modulation (i.e., a non-stationary signal with time-varying amplitude and frequency) of the ECG signal.
(b) Power-Line Interference: Power-line interference is another predominant ECG signal artifact that significantly alters the ST segment and degrades both the signal quality and frequency resolution, causing large-amplitude components in ECG signals that can resemble the P-Q-R-S-T waveforms [201].
Hence, these factors mask small features that are essential for ECG-based biometric recognition for human authentication. Power-line interference (PLI) noise in ECG signals usually consists of sinusoidal oscillations with the fundamental PLI component at either 50 Hz or 60 Hz and its harmonics [202]. A common choice for eliminating power-line interference is an adaptive filter, which can adjust its coefficients based on an appropriate algorithm. A traditional approach to eliminating power line interference is to employ a notch filter, tuned to the interference frequency [201]. However, the challenge of using an FIR notch filter is that the bandwidth of the notch is relatively large [203], which attenuates the required signal components within the bandwidth. In [204], since the EMD possesses an adaptive and signal-dependent property, a robust power-line interference elimination system based on the extended Kalman filter and the modified EMD has been applied to attenuate the ECG signal QRS complex. The least mean square (LMS) algorithm proposed in [205] and [206] has been broadly employed in adaptive filtering algorithms for the elimination of powerline interference. The adaptive filter typically measures the difference between the desired signal and the adaptive filter output [207], which it employs to algorithmically tune its coefficients to minimize the cost function of this difference. That is, the error signal becomes zero when the adaptive filter output equals the desired signal. Mathematically, at each iteration of the LMS algorithm, each filter tap weight is updated based on the following weight update equation:
$$\mathbf{w}(n+1) = \mathbf{w}(n) + \mu\, e(n)\, \mathbf{x}(n),$$
where $e(n)$ is the error signal in the $n$th iteration, $\mathbf{w}(n) = [w_0(n), w_1(n), \ldots, w_{N-1}(n)]^T$ is the adaptive FIR filter tap weight coefficient vector at time $n$ with filter length $N$, $\mathbf{x}(n)$ is the corresponding input vector, and $\mu$ is the step size parameter that is to be properly selected.
(c) Motion Artifacts: The electro-conductive fabric as a textile electrode is usually employed to collect physiological signals based on textile materials [208]. However, owing to its sensitivity variation (caused by impedance variation), it is challenging to acquire the ECG signal while the subject is moving. There are several studies on overcoming motion artifacts by treating them as mere sinusoidal noise in the conventional ECG electrode [13], [191], [209], [210]. Over recent decades, conventional signal-processing methods have been used to overcome motion artifacts. These conventional signal-processing methods include moving average filtering, wavelet transform, and finite impulse response/infinite impulse response (FIR/IIR) high-pass filtering [209]. Recently, adaptive filtering has been demonstrated to be beneficial in motion artifact reduction [185]. Based on the noise sensitivity of different QRS complex detectors, the authors in [211] observed that ECG electrode motion causes a variation in electrode-skin impedance, which produces baseline variations with a duration of 100 ∼ 500 ms. Moreover, [212] introduced a real-time QRS complex detector, considered motion artifacts to be sinusoidal waves with a bandwidth below 5 Hz, and then applied a filter to an over-sampled (2 kHz) ECG signal for improved timing resolution. There are also several studies on overcoming motion artifacts in ECG signals acquired via electroconductive fabrics, under more complex conditions than with the conventional electrode. The authors in [213] modeled motion artifacts as the difference between the motion-free signal (i.e., ECG from the conventional electrode) and the motion-added signal (i.e., ECG from e-textile) with a 5 Hz maximum frequency.
The authors in [214] used an injection current combined with an adaptive filter to reduce motion artifacts in capacitive ECG measurements occurring in the frequency band of the ECG without requiring knowledge about the measurement system. Here, the amplitude of the motion artifact is reduced on average by 29 dB in simulation and by 20 dB in a lab environment.
(d) Muscle Noise: The presence of muscle noise poses a major problem in several ECG applications, especially in recordings acquired during exercise, as low-amplitude waveforms may become completely covered. Specifically, the muscle noise components of the ECG signal become very large, owing to muscle contraction, and are very challenging to filter out from the ECG signals, thus damaging the signal characteristics essential for biometric recognition [215], [216]. For instance, larger-amplitude muscle artifacts cover the small-amplitude P-waves and make it difficult to establish the presence or absence of these waves, which affects biometric recognition. Besides, as the muscle noise spectral content overlaps the ECG signal spectral content, it becomes challenging to improve the signal-to-noise ratio (SNR) using digital filters alone without adding significant distortions in the ST-segment region [215]. It is reported in [217] that the muscle contraction noise, which arises in the ECG signals as additional ''bursts'', can typically be modeled as a zero-mean, band-limited Gaussian noise. A general solution to extract the ECG signal from wideband noise, which may be induced by muscle artifact, has been to pass the ECG signal through a low-pass filter with a low cut-off frequency, usually around 25 Hz [218]. However, the adoption of a low-pass filter has the drawback of additionally reducing the QRS complex amplitude, since the high-frequency ECG signal components, which are essential to represent the relatively high peaks of the QRS complex, are also eliminated by the filter.
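To make the LMS-based power-line interference cancellation discussed in (b) more concrete, a minimal sketch of a two-weight adaptive noise canceller is given below. It assumes a synthetic quadrature reference at the nominal mains frequency and the common update $\mathbf{w}(n+1) = \mathbf{w}(n) + \mu\, e(n)\, \mathbf{x}(n)$; the step size, the reference construction, and the function name are illustrative assumptions and do not reproduce the implementations in [205] or [206].

```python
import numpy as np

def lms_powerline_canceller(ecg, fs, mains_hz=50.0, mu=0.01):
    """Cancel sinusoidal mains interference with a two-weight LMS filter.

    Returns the cleaned signal (the error sequence) and the final weights.
    """
    ecg = np.asarray(ecg, dtype=float)
    n = np.arange(len(ecg))
    phase = 2 * np.pi * mains_hz * n / fs
    # Reference input x(n): sine/cosine pair at the assumed mains frequency.
    ref = np.stack([np.sin(phase), np.cos(phase)], axis=1)
    w = np.zeros(2)
    cleaned = np.empty_like(ecg)
    for k in range(len(ecg)):
        y = w @ ref[k]            # filter output: interference estimate
        e = ecg[k] - y            # error signal, i.e. the cleaned sample
        w = w + mu * e * ref[k]   # LMS weight update
        cleaned[k] = e
    return cleaned, w
```

A smaller `mu` gives a narrower effective notch and less distortion of the ECG, at the cost of slower convergence and slower tracking of amplitude or phase changes in the interference.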

4) INTER-SUBJECT VARIABILITY
Inter-subject variability is the variability between ECGs from different individuals. The uniqueness of the ECG signal can be assumed to stem mainly from the uniqueness of DNA [219], besides other physical factors such as age, race, and gender [220], which contribute to the different ECG variations. For instance, the QRS complex amplitude tends to increase from birth to adolescence and then gradually begins to decrease afterward [221]. The authors in [222] and [223] also find that the PR interval increases slightly with increasing age. Studies have shown that the amplitudes of the S wave in ECG signals are lower in women than in men between the ages of 18 and 40 [220]. While gender differences in ECG signal parameters are more evident in young adulthood, their effect is known to decrease afterward. Because ECG signals reflect the activities of the heart and exhibit inter-subject variability, they can be employed as a biometric modality for identification/verification purposes. The ECG signal is universal, stable, and unique; the inter-subject variability appears because the ECG portrays the myocardial electrophysiological variations influenced by heart mass orientation, cardiac muscle conductivity, and activation order [224]- [226]. That is, in addition to the desired inter-subject variability (uniqueness), the ECG signal should be sufficiently stable over time (low intra-subject variability) to enable ECG-based biometric authentication.
The pioneering works of [194] and [225] analyze in-depth the inter-subject variability of ECG signals required for human authentication.

5) INTRA-SUBJECT VARIABILITY
The perfect biometric modality should possess a very low intra-subject variability besides having both a very high inter-subject variability and stability over time [227]. The variability between different ECG signals from the same individual, or the variability within one ECG (the latter is also called beat-to-beat variability), is termed the intra-subject variability [228]. The ECG can be contaminated by several sources of intra-individual variability, obscuring the underlying cardiac state and limiting the accuracy of ECG interpretation. Among the most significant sources of intra-subject variability in ECG signals are chest electrode position variability and respiration variability [115]. While chest electrode position variability induces variation between ECGs of the same individual [229], respiration induces variability within a particular ECG [115]. Noise and artifacts may also induce intra-subject ECG variability [230]. However, this source has the benefit that it is usually clearly evident on the ECG, and its influence is widely known. Moreover, besides chest electrode position variability, intra-subject variability may also be induced by physical activities [231], [232], emotional states, drowsiness, and pharmaceutical drugs [233]. This may manifest mainly in the heart rate variability, altering the morphology of the P-R and S-T segments [234], [235]. Notably, studies in [227], [230]- [235] have shown that intra-subject variability in ECG signals is a source of uncertainty that has been a primary setback in the application of the ECG signal as a biometric trait. There are several general techniques for artifact removal from ECG recordings. For example, when the frequency bands of the ECG signal and the interference do not overlap [237], a simple low-pass, bandpass, or high-pass filter (e.g., an FIR or Butterworth filter with a cut-off frequency of about 30 Hz [236]) is an effective method for artifact removal.
However, interferences with a broad spectral distribution, such as muscular activities, overlap with the ECG signal spectrum. Consequently, high-pass filtering (or other frequency-selective filtering, such as consecutive notch filters) becomes extremely challenging, since it alters the ECG signal frequency content, changing outcome measures such as mean frequency and mean amplitude. While proper normalization can, in part, compensate for the effects of high-pass filtering on the ECG signal amplitude (assuming the frequency distribution continuously changes over activation levels), it is still challenging in investigations concerning muscle fatigue. Other techniques, such as wavelet decompositions and EMD, in addition to classical methods like filtering [238], ocular artifact correction [239], and regression [240], have also been successfully applied for ECG artifact removal [241], [242]. The authors in [237] consider a single-channel recorded ECG signal s with additive artifacts v of the form:

$$ s = x + v + \eta, $$

where x is the signal of interest and η is the noise term. The aim in [237] is to filter v from s in the wavelet domain using wavelet-based artifact removal, with minimum a priori knowledge of v, x, and η. In the preprocessing step, filtering techniques are applied to the original ECG signals, and such techniques have been used in diverse systems for ECG analysis. Since noise may result in wrong biometric authentication, ECG signal denoising is required. Therefore, considerable attention has been given over recent decades to designing mathematical methods and computational algorithms that pre-process the ECG signal to remove noise with an accuracy adequate for biometric authentication. Existing literature [35], [99], [161]- [165], [196], [199], [200], [243]- [254] comprises several denoising techniques for an ECG signal. For example, the authors in [248] introduced an adaptive spectro-temporal filtering method for ECG signal improvement.
In [245], [246], and [247], wavelet-based filtering methodologies for noise reduction in ECG signals are presented. While the authors in [249], [250] introduced non-linear Bayesian filtering-based methodologies for ECG denoising, the Kalman filter-based methodology of ECG denoising was instead adopted by the authors in [251]- [253]. It is worth noting that several challenges remain in providing automatic noise filtering, since several filter bank-based methods affect the P-waves and R-waves of the ECG signal [251], [252]. In [243], a technique based on Hilbert vibration decomposition (HVD) is proposed to correct the baseline wander of the ECG signal, where the authors found that the first decomposed component (the highest energy component) corresponds to the baseline wander noise. In [196], [199], [200], EMD-based algorithms were developed for baseline wander noise correction. The authors in [244] used fractal modeling to propose a projection operator-based method for baseline wander removal and applied a hybrid scheme of the EMD method and wavelet analysis to remove powerline interference. The authors in [254] introduced a detrending-based scheme (introduced originally for eliminating slow non-stationary drifts from heart rate variability) to remove baseline wander in ECG. Notably, the vast majority of baseline wander removal methods have in common that they remove the low-frequency components of the ECG signal. However, research finds that the baseline wander noise components in ECG signals may reside at higher frequencies, requiring more intrusive filters with higher cut-off frequencies. Such filters can likewise affect the ischemia-induced changes in the ECG ST-segment, compromising its use as a biometric modality. This is why it has been a long-standing practice to employ a high-pass filter with a cut-off frequency not greater than 0.05 Hz for baseline wander removal.
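As a rough illustration of the 0.05 Hz high-pass practice mentioned above, the following sketch applies a first-order (single-pole) high-pass filter to a synthetic signal. The filter order, the 250 Hz sampling rate, and the test signal (a 1 Hz component on a constant offset) are assumptions made for illustration and are not drawn from the cited works.

```python
import math

def high_pass(signal, fs, cutoff_hz=0.05):
    """First-order (single-pole) high-pass filter."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)   # time constant of the filter
    alpha = rc / (rc + 1.0 / fs)
    out = [signal[0]]
    for n in range(1, len(signal)):
        out.append(alpha * (out[-1] + signal[n] - signal[n - 1]))
    return out

FS = 250  # assumed sampling rate in Hz

# Hypothetical input: 1 Hz "cardiac" component riding on a constant offset of 1.0.
raw = [1.0 + math.sin(2 * math.pi * n / FS) for n in range(40 * FS)]
filtered = high_pass(raw, FS)

# Inspect the last 10 s, after the slow start-up transient has decayed.
tail = filtered[30 * FS:]
residual_offset = sum(tail) / len(tail)
remaining_amplitude = max(tail)
```

With such a low cut-off the offset decays slowly (the time constant is about 3 s), but the 1 Hz component is left almost untouched, which is the motivation for keeping the cut-off at or below 0.05 Hz.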
Notch filter-based approaches have widely proved advantageous over other methods for power-line interference cancellation in ECG signals [35], [99], [161]- [165]. However, using a notch filter tuned to the pulse frequency of 50 Hz or 60 Hz to suppress power line interference [255], the elimination may be delayed with a ''rebound'' or ringing artifact before onset and offset events [256]. Besides, proper tuning for bandwidth selection employing notch filters remains challenging [255], [256]. The authors in [257] introduced a robust framework for ECG-based power line interference removal by adopting an adaptive notch filter combined with a discrete-time oscillator and modified recursive least square (MRLS) scheme for ECG recordings. The work in [201] discusses several potential EMD-based adaptive filtering methods and reduction methods for power line interference cancellation in the ECG signal. Since researchers have introduced the use of notch filters and adaptive cancellers for power line interference suppression, the authors in [258] proposed an improved adaptive canceller for power line interference suppression in ECG-based signals via the LMS estimation methodology. In [259], a technique for eliminating power line interference in ECG signals, adopting an adaptive noise-canceling filter, is presented. In [260], two different hybrid signal processing schemes, namely i) EEMD-BLMS (ensemble EMD (EEMD) combined by Block LMS (BLMS)) adaptive algorithm and ii) Wavelet neural network (WNN) (discrete Wavelet transform (DWT) combined by the neural network), have been applied for baseline wander and power line interference suppression. Another EEMD-based method for removing power line interference in noisy ECG recordings is introduced in [261], where they decomposed the noisy ECG signal into intrinsic mode functions (IMFs) via EMD. 
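To make the notch filtering idea concrete, here is a minimal sketch of a fixed second-order (biquad) notch at 50 Hz. The coefficient formulas follow the widely used audio-EQ biquad design; the sampling rate, the quality factor Q = 30, and the test tones are assumptions, and this is not the adaptive scheme of [257] or [258].

```python
import math

def notch_coefficients(f0, fs, q):
    """Biquad notch coefficients (normalized so that a[0] == 1)."""
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = [1.0 / a0, -2.0 * math.cos(w0) / a0, 1.0 / a0]
    a = [1.0, -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0]
    return b, a

def biquad(signal, b, a):
    """Direct-form I implementation of a second-order IIR filter."""
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x0 in signal:
        y0 = b[0] * x0 + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        out.append(y0)
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return out

FS = 500  # assumed sampling rate in Hz
b, a = notch_coefficients(50.0, FS, q=30.0)

hum = [math.sin(2 * math.pi * 50 * k / FS) for k in range(3 * FS)]   # power line tone
slow = [math.sin(2 * math.pi * 5 * k / FS) for k in range(3 * FS)]   # in-band content

# Measure the last second only, after the filter transient has died out.
hum_residual = max(abs(v) for v in biquad(hum, b, a)[2 * FS:])
slow_amplitude = max(abs(v) for v in biquad(slow, b, a)[2 * FS:])
```

A higher Q narrows the notch but lengthens the transient at onset and offset, which is one way to see why bandwidth tuning and ringing are reported as difficulties.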
The work in [262] attempts to reduce the number of components required in the filter implementation and introduces nonrecursive FIR (NRFIR) filters for removal of baseline wander and power line interference from the ECG signal. Comparatively, methodologies based on either the EMD or the wavelet domain are effective for powerline interference suppression in ECG signals [262]. As the ECG signal is relatively weak and vulnerable to several noise artifacts, thresholding applied directly in either the EMD or the wavelet domain will lead to insufficient denoising, particularly in biometric authentication applications. Moreover, in the hybrid EMD-wavelet noise reduction approach introduced in [263], denoising with a wavelet threshold cannot discriminate between high-frequency noise and the QRS information. To overcome this challenge, the authors in [197] introduced an ECG denoising method adopting noise reduction algorithms in the EMD and wavelet domains. To preserve the QRS complex, windowing in the EMD domain is proposed in [197] to reduce the noise from the initial IMFs instead of discarding them completely, thus yielding a relatively cleaner ECG signal. Yet, the prior methods require different techniques for powerline interference and baseline wander removal, which often leads to some loss of the underlying structural information of the ECG signal in the enhancement process. Thus, finding filters capable of simultaneously removing both the baseline wander and the powerline interference without compromising ECG signal morphology is a relevant task. The authors in [264] introduced an iterative method to decompose a multi-component nonstationary ECG signal into mono-component signals based on repeatedly performing eigenvalue decomposition (EVD) on the Hankel matrix (HM) (i.e., EVDHM).
The results in [264] show that, unlike EMD, the EVDHM approach can separate constituent mono-component signals that are not influenced by their mean frequencies ratio nor by their relative amplitudes. This EVDHM method has been applied in speech signal processing in [265] [99]. It is worth mentioning that, at the original sampling rate, the phase angles corresponding to the ECG waveform feature frequencies are small, which would lead to inaccuracies in biometric identification. Down-sampling improves the phase angle resolutions and makes it more straightforward to classify the corresponding ECG waveform feature required for biometric identification. In [269], employing linear interpolation, the original 1,000 Hz ECG signals were down-sampled (or resampled) to 500 Hz, 250 Hz, 100 Hz, and 50 Hz sampling frequencies. While resampling to 500 or 250 Hz in [269] leads to best concordance, resampling to 50 Hz proved unsatisfactory for both time-and frequency-domain analyses. Specifically, at 50 Hz, the root-mean-square successive differences (RMSSDs) and the high-frequency power (expressed in absolute and normalized units) have a propensity of high values and random errors. However, in [269], ECG signals downsampled to 100 Hz yielded acceptable results for time-domain analysis and Poincaré plots, but not for frequency-domain.
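The resampling step discussed above can be sketched as follows. This is a minimal linear-interpolation resampler assuming integer sampling rates; the function name is hypothetical, no anti-aliasing filter is applied, and it is not claimed to reproduce the exact procedure of [269].

```python
def resample_linear(signal, fs_in, fs_out):
    """Resample by linear interpolation between neighbouring input samples."""
    # Number of output samples that fit inside the input time span.
    n_out = ((len(signal) - 1) * fs_out) // fs_in + 1
    out = []
    for k in range(n_out):
        pos = k * fs_in / fs_out              # fractional index into the input
        i = min(int(pos), len(signal) - 2)    # left neighbour (kept in range)
        frac = pos - i
        out.append((1.0 - frac) * signal[i] + frac * signal[i + 1])
    return out

# Hypothetical 1 s recording at 1,000 Hz (a simple ramp for easy checking).
ramp = [float(n) for n in range(1000)]
at_250 = resample_linear(ramp, 1000, 250)
at_50 = resample_linear(ramp, 1000, 50)
```

At 50 Hz only one sample in twenty survives, so narrow features such as the R-peak may be missed or displaced, which is consistent with the degraded results reported at that rate.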

6) ECG SIGNAL NORMALIZATION
Notably, the use of various acquisition equipment, or the interaction of the subject with it, may cause differences in the signal amplitude and DC offset voltage of the ECG [22]. Additionally, heart rate variability, a physiologically inherent variation in heartbeat durations, is modulated by several physiological factors. Thus, numerous researchers have introduced amplitude and time normalization algorithms for ECG-based biometrics to address these concerns, as amplitude or time normalization is necessary for identification [20], [91], [247], [271]. ECG signal normalization entails scaling the ECG signal amplitude to the same peak amplitude. Moreover, in ECG biometrics, the ECG signals suffer from inter-subject and inter-session (intra-subject) variability; hence, obtaining a latency- and amplitude-invariant set of features becomes essential [226]. To address these challenges, time and amplitude normalization is performed by re-scaling each segment to a common number of samples and a common amplitude. Time normalization has been far less commonly employed. The authors in [272] introduced a method of normalizing ECG signals to a standard heart rate to lower the false detection rate. In [273], the authors normalized the QRS complexes to extract the salient features and to reduce error discrepancies, and they find that the normalized convoluted result exhibits waveforms comparable to QRS complexes.
In the following, we present several commonly used normalization techniques for preprocessing ECG signals.
(a) Min-Max Normalization: Based on amplitude normalization, the authors in [274] propose to normalize the heartbeats to have a minimum value of 0 and a maximum value of 1 using the following formulation:

$$ y[n] = \frac{x[n] - \min}{\max - \min}, \tag{5} $$

where y and x denote the normalized and the original segments, respectively, while max and min are the maximum and minimum values of the feature dimensions of x[n], respectively. The authors in [22], [114], [275], [276], [276] also used the min-max normalization rule in (5) to map the original data to 0 − 1 by a linear transformation.

(b) Max-Div Normalization: The authors in [226] and [278] take the segmented time-normalized signals and normalize them using a normalization factor of the average R-peak amplitude value, which specifically reads:

$$ y[n] = \frac{x[n]}{\max}, \tag{6} $$

where y and x denote the normalized and the original segments, respectively, while max is the maximum value of the feature dimensions of x[n].

(c) Z-Score Normalization: The authors in [99] introduced a z-score approach to normalizing the heartbeat segments by initially subtracting the signal mean and then dividing the result by the standard deviation. The Z-score normalization technique reads:

$$ y[n] = \frac{x[n] - \mu}{\sigma}, \tag{7} $$

where y and x denote the normalized and the original segments, respectively, while µ and σ are the mean value and standard deviation of the ECG signal, respectively. The author in [279] also used the Z-score normalization rule in (7) to remove the amplitude scaling problem present in the ECG signal x[n].

(d) Median Normalization: The median normalization technique adopts the median and the median absolute deviation (MAD) rather than the mean and standard deviation employed in z-score normalization [280]. The median normalization technique reads:

$$ y[n] = \frac{x[n] - \mathrm{median}}{\mathrm{MAD}}, \tag{8} $$

where y and x denote the normalized and the original segments, respectively, while median and MAD are the median and the median absolute deviation of x[n], respectively.

(e) Decimal Scaling Normalization: The decimal scaling normalization technique reads:

$$ y[n] = \frac{x[n]}{10^{\lfloor \log_{10} \max \rfloor + 1}}, \tag{9} $$

where y and x denote the normalized and the original segments, respectively, $\log_{10}$ is the logarithm base 10 operator, $\lfloor \cdot \rfloor$ is the floor operator, which rounds to the nearest integer below its current value, and max is the maximum value of the raw ECG signal x[n].

(f) Tanh-Estimators Normalization: The tanh-estimators normalization technique reads:

$$ y[n] = \frac{1}{2}\left\{ \tanh\left( 0.01\,\frac{x[n] - \mu_G}{\sigma_G} \right) + 1 \right\}, \tag{10} $$

where y and x denote the normalized and the original segments, respectively, while $\mu_G$ and $\sigma_G$ are the mean and standard deviation of the genuine matching score distribution of the raw ECG signal x[n], respectively, as given by Hampel estimators. The estimator in (10) is based on the following influence ψ-function:

$$ \psi(u) = \begin{cases} u, & 0 \le |u| < a, \\ a\,\mathrm{sign}(u), & a \le |u| < b, \\ a\,\mathrm{sign}(u)\,\dfrac{c - |u|}{c - b}, & b \le |u| < c, \\ 0, & |u| \ge c, \end{cases} \tag{11} $$

where

$$ \mathrm{sign}(u) = \begin{cases} +1, & u \ge 0, \\ -1, & \text{otherwise}. \end{cases} \tag{12} $$

The Hampel [282] influence function decreases the influence of the scores at the distribution tails (distinguished by a, b, and c) throughout the estimation of the location and scale parameters. Thus, this technique becomes less sensitive to outliers.

(g) Double Sigmoid Normalization: The double sigmoid normalization technique reads:

$$ y[n] = \begin{cases} \dfrac{1}{1 + \exp\left(-2\,\dfrac{x[n] - m}{r_1}\right)}, & x[n] < m, \\ \dfrac{1}{1 + \exp\left(-2\,\dfrac{x[n] - m}{r_2}\right)}, & \text{otherwise}, \end{cases} \tag{13} $$

where m is the reference point chosen (i.e., some value falling in the region of genuine and impostor scores) and the parameters $r_1$ and $r_2$ are the left and right edges of the region in which the function is linear. In (13), $r_1$ and $r_2$ are chosen as $r_1 = m - \min(x[n])$ and $r_2 = \max(x[n]) - m$. Table 6 presents a summary of the characteristics of the different normalization techniques.
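The first four normalization rules described in this subsection (min-max, max-div, z-score, and median/MAD) can be sketched in a few lines. The function names and the sample values are hypothetical, and the population standard deviation is an assumed choice, since the text does not say which estimator is used.

```python
import statistics

def min_max(x):
    """Map the segment linearly onto the interval [0, 1]."""
    lo, hi = min(x), max(x)
    return [(v - lo) / (hi - lo) for v in x]

def max_div(x):
    """Divide every sample by the segment maximum."""
    peak = max(x)
    return [v / peak for v in x]

def z_score(x):
    """Subtract the mean and divide by the (population) standard deviation."""
    mu = statistics.mean(x)
    sigma = statistics.pstdev(x)
    return [(v - mu) / sigma for v in x]

def median_mad(x):
    """Subtract the median and divide by the median absolute deviation."""
    med = statistics.median(x)
    mad = statistics.median([abs(v - med) for v in x])
    return [(v - med) / mad for v in x]

# Hypothetical heartbeat segment with one large, R-peak-like sample.
beat = [0.0, 1.0, 2.0, 3.0, 10.0]
print(min_max(beat))
print(median_mad(beat))
```

The single large sample compresses the min-max output of the remaining samples into a narrow range, whereas the median/MAD rule keeps their spacing, which hints at why the robust variants are of interest when outliers are present.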

B. ECG SIGNAL FEATURE EXTRACTION CATEGORIES
Diverse feature extraction methods have been proposed for ECG-based biometrics [95]. We provide an overview of the existing ECG-based feature extraction techniques (i.e., handcrafted and non-handcrafted feature extraction) for biometric authentication. Specifically, to contribute to the ECG-based biometric investigation, we summarize, in Table 7, feature extraction modalities from existing studies using evaluation metrics like EER, accuracy, false accept rate (FAR), and false reject rate (FRR). In recent decades, several types of handcrafted feature extraction methods, such as fiducial feature extraction, DCT, auto-correlation, and wavelet transform, have been used to extract the signal features for the classification problem in ECG-based biometric authentication and identification.

1) HANDCRAFTED FEATURE-BASED ALGORITHMS
There are two major types of handcrafted feature extraction methods, one being fiducial-based and the other being non-fiducial-based. In [22], [266], [271], [284]- [290], handcrafted feature vectors were extracted from the QRS complex of heartbeats, as this region is deemed to contain most of the ECG signal information. However, research investigations in [189] demonstrate that the P, Q, R, S, and T peak waves also hold significant information. Compared with the P, Q, R, S, and T peak waves, the QRS corresponds to a higher polarisation event over a shorter period [73]. Hence, the QRS is more robust to noise and intra-subject variability than the other ECG waveforms, making it better suited for biometric recognition.
The fiducial feature-based algorithms employ the characteristic points of the ECG beats (onset and offset) of the P, QRS, and T waves, the time difference between the peaks of the Q and T waves, and the QT interval in a single ECG beat or segment. Existing works have employed diverse subsets of these fiducial features [284], [291], [292]. The non-fiducial feature extraction method, in contrast, does not employ characteristic points for the feature set generation [284]; instead, the algorithms depend on thoroughly analyzing the ECG signal, usually by applying time or frequency analysis to acquire other statistical features. Generally, the non-fiducial feature-based algorithms extract discriminative information from the ECG waveform and eliminate the need for fiducial point localization for biometric recognition. Existing works have employed diverse subsets of these non-fiducial features, like the wavelet transform [11], [293]- [295], autocorrelation [287], [288], [295], DCT [23], [289], [296], and normalized-convoluted normalized (NCN) [266], [293], [297].
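As an illustration of a non-fiducial feature, the sketch below computes normalized autocorrelation coefficients of a signal window, which requires no localization of P, QRS, or T points. The window, the lag range, and the sinusoidal test signal are assumptions for illustration; the cited autocorrelation-based works may use different windowing and post-processing.

```python
import math

def autocorrelation(signal, max_lag):
    """Normalized autocorrelation coefficients for lags 0..max_lag."""
    mean = sum(signal) / len(signal)
    centered = [v - mean for v in signal]
    energy = sum(v * v for v in centered)
    return [
        sum(centered[n] * centered[n + lag] for n in range(len(centered) - lag)) / energy
        for lag in range(max_lag + 1)
    ]

# Hypothetical quasi-periodic window: one cycle every 50 samples.
window = [math.sin(2 * math.pi * n / 50) for n in range(500)]
features = autocorrelation(window, max_lag=60)
```

The coefficient at lag 0 is 1 by construction, and the coefficients peak again near the cycle length, so the feature vector reflects the repetitive structure of the waveform without any peak detection.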

C. ECG SIGNAL FEATURE SELECTION
Feature selection is an essential phase in ECG data preparation and one of the significant challenges in the development of the classification model. Feature selection is an approach to choosing a minimum subset of features from the original set of features to optimally reduce the feature space dimension without affecting the classification accuracy. Generally, there exist three methods of solving the feature selection problem: filters, wrappers, and embedded methods [303], [304]. The filter-based approach selects features by their statistical properties; the resulting learning model performance is not usually as high as that of the wrapper approach, since the selected features may not be the optimal ones [303]. However, the filter-based approaches are readily scalable to high-dimensional datasets, computationally simple, and fast [304]. Figure 5 illustrates the filter-based feature selection methods. The wrapper technique employs optimization algorithms in the machine learning methodology to determine the optimal subset of features. Here, the search algorithm is wrapped around the classification model, which provides a feature subset that can be evaluated by the classification algorithm [304]. Moreover, wrapper methods typically consider feature dependencies and accommodate interaction between the feature subset search and the choice of a learning model, but they are computationally more demanding than filters. The wrapper-based feature selection method is depicted in Fig. 6. Some representative examples of wrapper methods are forward feature selection, backward feature elimination, and recursive feature elimination. On the other hand, the embedded method builds an optimal feature subset search into the classifier framework. That is, it combines feature selection into the classifier training process.
Under the embedded method, combining feature selection into the classifier training process makes the selection specific to the utilized learning model, just as with wrappers, with the advantage of being less computationally demanding than wrappers [304]. While filter- or wrapper-based feature selection techniques are more commonly used separately in the literature, several other studies have combined the filter and wrapper feature selection schemes [305], [306].
In [305] and [306], the wrapper-based feature selection method has been proposed to reduce the dataset dimension and improve classifier accuracy. These works differ in three respects: datasets, classifiers, and feature selection methods. The authors in [307] and [308] introduced hybrid genetic algorithm (GA)-based feature selection methods. A parallel GA-based feature selection optimization has been introduced in [309]. In [310], a heuristic search method has been employed for the feature selection problem. In [311], a multi-objective evolutionary algorithm (MOEA) technique has been introduced for feature selection. A genetic algorithm for feature selection in binary classification has been utilized for dimensionality reduction to improve ECG data classification performance in [312]. The authors in [313] used a normalized mutual information-based feature selection wrapper embedded with kNN machine learning model classification to improve the classification accuracy.
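A filter-based selection step of the kind described in this section can be sketched as follows. The Fisher score (between-class over within-class variance) is used here as one possible statistical criterion; it is an assumed choice for illustration and is not the criterion of any specific cited work. The function names and the toy feature table are hypothetical.

```python
from statistics import mean, pvariance

def fisher_scores(samples, labels):
    """Score each feature by between-class over within-class variance."""
    classes = sorted(set(labels))
    scores = []
    for j in range(len(samples[0])):
        column = [row[j] for row in samples]
        overall = mean(column)
        between = within = 0.0
        for c in classes:
            values = [row[j] for row, y in zip(samples, labels) if y == c]
            between += len(values) * (mean(values) - overall) ** 2
            within += len(values) * pvariance(values)
        scores.append(between / within if within > 0 else float("inf"))
    return scores

def select_top_k(samples, labels, k):
    """Return the indices of the k highest-scoring features, in index order."""
    scores = fisher_scores(samples, labels)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])

# Hypothetical feature table: three features per beat, two enrolled subjects.
samples = [[1.0, 5.0, 0.2], [1.1, 3.0, 0.4], [0.9, 4.0, 0.3],
           [3.0, 4.0, 0.5], [3.1, 5.0, 0.7], [2.9, 3.0, 0.6]]
labels = ["A", "A", "A", "B", "B", "B"]
selected = select_top_k(samples, labels, k=2)
```

No classifier is trained during the ranking, which is what makes the filter approach fast; a wrapper would instead retrain and evaluate the classifier for each candidate subset.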

D. ECG SIGNAL FEATURE TRANSFORMATION
Feature selection is often confused with dimensionality reduction (sometimes known as feature transformation) [73], [78]. Both methods tend to reduce the number of attributes in the dataset. While feature selection reduces the feature dimension by including and excluding attributes present in the data without modifying them (i.e., selecting a more discriminative subset among an initial feature set), the feature transformation (or dimensionality reduction) method achieves this by creating new combinations of attributes [73], [78]. Some examples of feature transformation methods are PCA, SVD, LDA, and ICA.

1) PCA FOR DIMENSIONALITY REDUCTION (FEATURE TRANSFORMATION)
Principal component analysis (PCA) is a classical unsupervised dimensionality reduction method that learns a projection matrix such that the variance of the low-dimensional data is maximized [314]. The technique is linear in that the components are linear combinations of the original feature variables. However, for effective visualization, the non-linearity in the data is preserved. Large datasets are increasingly common and are often challenging to interpret. PCA is a technique for reducing the dimensionality of such datasets, increasing interpretability while at the same time minimizing information loss [314]. A well-known limitation of unsupervised algorithms is that the label information is not used in the classification task. However, using the label information of the learning data sets to create more efficient methods, such as supervised dimensionality reduction for ECG-based biometric authentication, improves detection accuracy [315], [316]. See, for instance, [317] for a comprehensive history and treatment of PCA. In [318], the authors employed the PCA method for feature reduction of ECG signals by assuming that the set of attributes can be split into subgroups of similar characteristics and then subjected to PCA. An extension of PCA, termed Kernel PCA, is a non-linear generalization that corresponds to PCA realized in a reproducing kernel Hilbert space associated with a positive-definite kernel. The experimental results in [32] show higher test recognition rates of Gaussian one-against-all (OAA) SVMs on random unknown ECG data sets with the use of Kernel PCA as compared to the use of LDA and PCA.
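The variance-maximizing projection can be illustrated with a small sketch that extracts only the first principal component by power iteration on the sample covariance matrix. Power iteration is an assumed, simple way to obtain the leading eigenvector and is not claimed to be the procedure used in the cited works; the function name and data are hypothetical.

```python
import math

def first_principal_component(samples, iterations=100):
    """Leading eigenvector of the sample covariance matrix, plus the projected scores."""
    n, d = len(samples), len(samples[0])
    means = [sum(row[j] for row in samples) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in samples]
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0] * d  # starting vector; must not be orthogonal to the leading component
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    scores = [sum(r[j] * v[j] for j in range(d)) for r in centered]
    return v, scores

# Hypothetical two-feature data lying exactly on the line y = 2x.
data = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
component, scores = first_principal_component(data)
```

Here the two correlated features collapse onto a single score per sample without losing any variance, which is the idealized case of the dimensionality reduction PCA aims for.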

2) SVD FOR FEATURE TRANSFORMATION
Generally, singular values play a significant role in the generation of various hash functions [215]. The singular value decomposition (SVD) of a matrix X is the factorization of X into the product of three matrices. That is, the matrix $X \in \mathbb{C}^{N \times M}$ is decomposed in the form [63], [64], [215]

$$ X = U \Sigma V^{H}, $$

where the columns of $U \in \mathbb{C}^{N \times N}$ and $V \in \mathbb{C}^{M \times M}$ are, respectively, the left- and right-singular vectors for the corresponding singular values. Here, the SVD decomposes the signal into the sub-matrices U, $\Sigma \in \mathbb{C}^{N \times M}$, and V, where Σ is known as the diagonal singular matrix. SVD decomposes the original matrix into a matrix of uncorrelated variables and thus enables dimensionality reduction based on the low rank of the singular matrix $\Sigma \in \mathbb{C}^{N \times M}$.
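How the SVD supports dimensionality reduction can be made explicit with the standard rank-r truncation, given here as a short worked statement. The symbols $\sigma_i$, $u_i$, $v_i$, and r are introduced for illustration: $\sigma_i$ are the singular values in decreasing order, and $u_i$, $v_i$ are the corresponding columns of U and V.

```latex
X = U \Sigma V^{H} = \sum_{i=1}^{\min(N,M)} \sigma_i \, u_i v_i^{H},
\qquad \sigma_1 \ge \sigma_2 \ge \dots \ge 0,
\\[4pt]
X_r = \sum_{i=1}^{r} \sigma_i \, u_i v_i^{H},
\qquad
\lVert X - X_r \rVert_F^{2} = \sum_{i > r} \sigma_i^{2}.
```

Keeping only the r largest singular values therefore discards exactly the energy carried by the remaining ones, so a rapidly decaying singular spectrum indicates that a small r may suffice.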

3) LDA FOR FEATURE TRANSFORMATION
LDA is a popular supervised feature transformation method for dimensionality reduction. However, whenever the distance criterion of its objective function uses the $\ell_2$-norm, it is sensitive to outliers [320]. LDA also suffers from several drawbacks, discussed in the following: (a) The first drawback is that conventional LDA is incapable of handling multi-modal data, whose distribution is more complex than Gaussian. Several methods have been proposed in the literature to overcome this issue. For example, the authors in [321] introduced a pairwise formulation of LDA, the so-called neighborhood MinMax projections (NMMP), which strives to pull pairwise points within the same class as close together as possible while pushing those between different classes as far apart as possible. Moreover, the method in [321] [323], which makes it challenging to manage small-scale data with high dimensionality.

4) ICA FOR FEATURE TRANSFORMATION METHOD
Contrary to PCA, ICA identifies non-Gaussian components, and the goal is to linearly transform the data such that the variables after transformation are independent of each other [78]. Since ECG signals are usually recorded in a high-dimensional space, classification rules in the high-dimensional feature space are difficult and time-consuming to learn. Hence, several ICA algorithms, for instance, Infomax, FastICA, and second-order blind identification (SOBI), have been introduced. The extracted components are statistically independent, i.e., there is no overlapping information between the components. While ICA involves higher-order statistics, PCA involves second-order statistics and constrains the components to be mutually orthogonal. As a result, PCA and ICA frequently select different subspaces onto which to project the data. While ICA has been proposed as an alternative to PCA, it suffers from several challenges owing to instability, the choice of the number of components to extract, and high dimensionality. Therefore, for high-dimensional ECG data sets, PCA is usually employed as a preprocessing step to reduce the feature set dimensionality, and ICA can subsequently be applied to the subset of data summarized by a small number of principal components (PCs) from the PCA.

E. ECG SIGNAL CLASSIFICATION
ECG signal classification plays a crucial role in ECG-based biometric identification for human authentication, and it entirely relies on the extraction of features from ECG waveforms. Classification of ECG signals is a challenging problem owing to problems associated with the classification process [78]. The significant problems in ECG classification tasks include lack of standardization of ECG features, the nonexistence of optimal classification rules for ECG classification, the individuality of the ECG patterns, variability amongst the ECG features, and variability in the subject ECG waveforms [78]. It is worth noting that an ECG-based identification system initially requires an enrolment phase, which serves to acquire and store the subject's unique attributes.
Here, specific preprocessing for noise and artifact rejection, including feature extraction, are realized before the data storage. After the features of distinctive subjects are stored, the identification phase can commence. During signal identification, the unknown ECG introduced to the system requires preprocessing initially to remove the noise, and then feature extraction/transformation is performed subsequently, as in the enrolment phase [115]. Moreover, a specific classification algorithm assigns the extracted features to the best matching subject's data, as stored in the database. In the following, we concentrate on classification strategies for ECG-based human recognition.
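The enrolment and identification phases described above can be sketched as a simple template-matching scheme. The averaging of feature vectors into one template per subject and the nearest-template (Euclidean) decision rule are assumptions chosen for brevity; the function names and feature values are hypothetical, and preprocessing and feature extraction are assumed to have been done beforehand.

```python
import math

def enrol(database, subject_id, feature_vectors):
    """Store the mean feature vector of a subject as that subject's template."""
    d = len(feature_vectors[0])
    database[subject_id] = [
        sum(vec[j] for vec in feature_vectors) / len(feature_vectors) for j in range(d)
    ]

def euclidean(u, v):
    """Euclidean distance between two equally long vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def identify(database, features):
    """Return the enrolled subject whose template is closest to the query features."""
    return min(database, key=lambda sid: euclidean(database[sid], features))

# Enrolment phase: hypothetical two-dimensional features for two subjects.
database = {}
enrol(database, "subject_A", [[1.0, 2.0], [1.2, 2.2]])
enrol(database, "subject_B", [[5.0, 5.0], [5.2, 4.8]])

# Identification phase: an unknown beat is assigned to the best-matching subject.
match = identify(database, [1.0, 2.3])
```

A practical system would also need a rejection threshold on the distance so that unenrolled subjects are not forced onto the nearest template.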

1) ANN CLASSIFICATION MODELS
ANNs, an established biologically inspired paradigm, are a promising machine learning method for classifying non-linear ECG signals for biometric recognition. ANNs use diverse methods in their implementation processes, such as supervised learning, unsupervised learning, or reinforcement learning. Several researchers have adopted diverse models of neural networks for ECG signal classification. The ANN models are data-driven, self-adaptive, non-linear, accurate, fast, robust to noise, and easily scalable [78]. Some of the major benefits of ANNs include:
• An ANN renders a non-linear mapping between inputs and outputs utilizing activation functions such as the sigmoid and can be applied to solve non-linear problems such as the classification of ECG signals [78].
• It can achieve comparable or better results than statistical or deterministic methods. It is worth noting that since statistical methods are designed based on the assumption of a given linear time series, they cannot yield reliable results for non-linear problems but perform well for linear problems.
• The ANN methodology can adaptively model the low-frequency components of the ECG, which are inherently non-linear.
• Using an ANN enables the easy removal of the time-varying and non-linear noise characteristics of the ECG signal.
Some of the major drawbacks of ANNs include:
• The ANN training algorithm is typically unable to guarantee a global minimum set of weights.
• An ANN may not necessarily furnish an optimal solution for the entire 12-lead ECG classification process [78].
Prior hardware-based ANN realizations often use a finite state machine (FSM) combined with a generic arithmetic logic unit (ALU) to realize neurons for the feed-forward computation and then reuse the neurons in the hidden layers and output layer for feed-forward computation. The most popular and generally utilized learning algorithm applied for ANN weight estimation is the backpropagation (BP) algorithm. The general practice for updating weights is $\Delta W_{ji} = \eta \delta_j O_i$, where [332]:
• η denotes the learning rate (a real number), which defines the step size of the gradient descent search. Setting a larger value for the learning rate may help the network converge faster. However, owing to the larger step size, oscillation may occur and cause divergence or, in some cases, overshooting of the minimum. On the other hand, setting a lower value for the learning rate helps the gradient move in the correct direction and gradually approach the minimum point. However, the convergence rate is compromised as a result of the smaller steps taken.
• $O_i$ denotes the output calculated via the ith neuron.
• $\delta_j = O_j (1 - O_j)(D_j - O_j)$ for the output neurons, where $D_j$ is the desired output for neuron j, and $\delta_j = O_j (1 - O_j) \sum_k \delta_k W_{kj}$ for the hidden neurons.
Specifically, ANNs trained using backpropagation are powerful learning architectures that attempt to solve both linear and non-linear classification problems. The various ANN architectures commonly applied in the ECG classification field include:
(a) Complex valued ANN (CVANN): CVANNs, whose parameters (i.e., weights, threshold values, inputs, and outputs) are all complex numbers, find application in fields dealing with complex numbers, such as ECG-based biometric recognition. The benefit of adopting a CVANN rather than its real-valued ANN (RVANN) counterpart is well-known [75]. However, the selection of the node activation function in a CVANN is a challenging problem. The authors in [333] applied the CVANN for ECG signal classification, achieving accuracy rates of 99.8% (averaged) and 99.2% for the first and second classification tasks, respectively. In [334], the authors proposed a CVANN for ECG signal classification, which achieved a 100% accuracy rate using a 3-level complex wavelet transform.

(b) Backpropagation neural network (BPNN): The BPNN algorithm is the most common supervised learning algorithm and the most extensively applied method for training feed-forward neural networks. The seminal work of [42] introduced automated recognition in which a BPNN classifier is employed with time-domain features of each beat extracted from a 12-electrode ECG. The work in [335] proposed an automated ECG recognition method based on a BPNN, which exhibited a steady recognition precision of more than 99% on ECG signals.
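The weight update used by backpropagation can be illustrated with a minimal sketch. It assumes a single sigmoid neuron trained on a squared-error objective, so that the update is $\Delta W_i = \eta \delta O_i$ with $\delta = O(1 - O)(D - O)$. The function names, the toy OR data, and the values of `eta` and `epochs` are illustrative assumptions and are not taken from any of the surveyed works.

```python
import math


def sigmoid(z):
    """Logistic activation function."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, inputs):
    """Forward pass of one sigmoid neuron; the last weight is the bias."""
    o_in = list(inputs) + [1.0]
    return sigmoid(sum(w * o for w, o in zip(weights, o_in)))


def train_neuron(samples, eta=0.5, epochs=2000):
    """Train one sigmoid neuron with the rule dW_i = eta * delta * O_i.

    Assumption: squared-error loss, so delta = O * (1 - O) * (D - O).
    """
    n_inputs = len(samples[0][0])
    weights = [0.0] * (n_inputs + 1)  # inputs plus one bias weight
    for _ in range(epochs):
        for inputs, desired in samples:
            o_in = list(inputs) + [1.0]
            out = predict(weights, inputs)
            delta = out * (1.0 - out) * (desired - out)
            weights = [w + eta * delta * o for w, o in zip(weights, o_in)]
    return weights


# Toy, linearly separable data (logical OR); purely illustrative.
OR_DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
weights = train_neuron(OR_DATA)
```

Raising `eta` makes each step larger, which can speed up convergence but also reproduces the oscillation risk described for the learning rate.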

2) DEEP LEARNING METHODS APPLIED AS CLASSIFICATION
Deep learning (or deep structured learning), a subset of artificial intelligence and machine learning, has lately been applied to ECG signals for classification purposes. Deep learning surfaced in the works of [336], [337] with DBNs as a machine learning framework that employs multiple layers of non-linear information processing for supervised or unsupervised classification and that can also be used for feature extraction and transformation. Specifically, deep learning methods aim at learning feature hierarchies in which higher-level features are defined in terms of lower-level features. Building on successful deep learning architectures, recent algorithmic improvements in ECG classification are mainly devoted to DNNs. While diverse works have reported performance improvements, they still depend on a comparatively direct application of DNNs. Improvements to the training method and model architecture for ECG have not been rigorously investigated, which leaves room for expansion and modification. After the development of DBNs, several other unsupervised deep learning models have been proposed to improve the performance of ECG classification tasks. Examples include (1) the sparse autoencoder network [336], which learns sparse overcomplete features using a linear encoder and a linear decoder, preceded by a sparsifying non-linearity that turns a code vector into a quasi-binary sparse code vector, and (2) the autoencoder-based greedy layer-wise unsupervised learning introduced in [43].
• Types of Deep Learning Approaches: With the advancement of novel optimization algorithms, new milestones were reached in deep learning, giving rise to the following three categories of deep learning training: DSL, DSSL, and DUL.
(a) DSL: DSL models (or supervised deep networks), also called discriminative deep networks, are designed to provide higher discriminative power for classification, usually by characterizing the posterior distributions of classes conditioned on the visible data. Target label data are always available in direct or indirect form for such supervised learning. Hence, DSL models are typically highly efficient to train and test, more flexible to design, and better suited to end-to-end learning of complex systems. DSL can be classified into the following types: CNNs and RNNs.
-CNNs: CNN is one of the most common DNN architectures and is usually trained with a gradient-based optimization algorithm. Generally, a CNN includes multiple back-to-back layers combined in a feed-forward fashion. Few previous works use CNNs as classifiers for ECG biometric authentication [339]. In [339], a multimodal biometric system that combines a CNN and a Q-Gaussian multi-SVM (QG-MSVM) and relies on distinct fusion levels was introduced for authentication. They obtained an EER of 3.2% using the PTB database and an EER of 2.9% using the CYBHi database. The authors in [391] also generated the feature template with the CNN classifier, which they preserved using the matrix operation method. Finally, they introduced a QG-MSVM classifier for authentication and realized an EER of 3.5% with the PTB database. In [119], an ECG-based authentication system that combines manual features with a CNN is introduced. They employed scanning and eliminating techniques for feature extraction and a CNN for classification, and realized EERs of 4.47% and 1.63% on the CYBHi and PTB databases, respectively. In [108], a deep CNN on a PTB database containing 109 subjects and all 64 ECG channels is proposed, where data augmentation techniques are explored for the training. They realized an EER of 0.19% using the Physionet database. In [5], Deep-ECG, a CNN-based biometric approach for ECG signals, is introduced for biometric recognition, where stochastic gradient descent with momentum is used for training. Their algorithm achieved an EER of 2.90% using the PTB database for authentication. Only a few studies have used deep-CNN strategies for ECG analysis [286], [338], and these focus on classifying heartbeats as healthy or non-healthy rather than on biometric recognition. The work in [340] proposed the application of CNNs to develop a human recognition system employing ECG.
They derived the ECG features from a 1-D CNN and a 2-D CNN using two strategies, the raw ECG signal strategy and the heartbeat spectrogram representation strategy, and then applied three score-level fusion strategies: the sum rule, the mean rule, and the multiplication rule. They realized an EER of 15.60% for the 1-D CNN, an EER of 20.48% for the 2-D CNN, and an EER of 13.93% for the fusion of the two CNN models. The work in [401] proposed the application of CNNs to develop a human recognition system employing ECG. They realized for the 1-D CNN an EER of 1.53% using the PTB database and an EER of 0.27% using the CYBHi database. The work in [11] developed a 1-D CNN-based biometric recognition system for human authentication and evaluated the method on eight datasets from PhysioNet (CEBSDB, WECG, FANTASIA, NSRDB, STDB, MITDB, AFDB, and VFDB), where the 1-D CNN realized an average identification rate of 93.5%. Few studies have combined CNNs with other methods. Considering that CNN is less sensitive to noise, an ECG-based biometric authentication system that incorporates the generalized S-transformation and CNN techniques is proposed in [341] to improve the classification accuracy. They achieved identification rates of 99%, 98%, and 99% using the ECG-ID database, Physionet database atrial fibrillation (AF) ECG signals, and Physionet database noisy ECG signals, respectively.
-RNNs: RNN is a category of ANNs where connections between nodes form a directed graph along a temporal sequence. The use of advanced RNN architectures, such as LSTM and GRUs for learning long dependencies, has led to significant improvements in various tasks, including ECG-based biometric recognition. The principal notion behind these networks is to employ several gates to control the information flow from previous steps to the current step.
By using gates, any recurrent unit can learn a mapping from one point to another. LSTM is used widely in time series signal analysis, such as the classification of ECG signals. In [342], a bidirectional LSTM-based deep RNN that uses late fusion to construct a real-time system is proposed for ECG-based biometric identification and classification. Two public datasets were used to evaluate the proposed model: MIT-BIH Normal Sinus Rhythm (NSRDB) and MIT-BIH Arrhythmia (MITDB). On the NSRDB (and MITDB), the proposed model achieved an overall precision of 100% (99.8%), a recall of 100%, an accuracy of 100% (99.8%), and an F1-score of 1 (0.99). In [69], LSTM was demonstrated to be more appropriate than GRUs for identification and classification in ECG biometrics. They achieved nearly 100% classification accuracy for the identification problem using the ECG-ID dataset and observed similar results for the MITDB dataset.
(b) DUL: From the viewpoint of generative learning, DUL models (or unsupervised deep networks) are pre-trained using generative models, such as RBMs, can subsequently be fine-tuned using standard supervised learning algorithms, and are then applied to the test data set for classification. Specifically, DUL methods operate without labeled classes and capture high-order correlations of the data. The most common architectures of DUL models are autoencoder-based methods, DBNs, and deep Boltzmann machines (DBMs).
-Autoencoder-Based Methods: Autoencoders (AE) are neural networks that can learn complex representations of the data and are used to automatically extract and select features for classification in an unsupervised fashion from ECG data annotated with beat locations. Specifically, AEs have been used to learn lower-dimensional representations of the original data and to pre-train other deep learning networks, e.g., CNNs. In [1], an autoencoder has been used to pre-train a DNN for the active classification of ECG signals for biometric recognition. In [343], an ECG-based biometric identification system that uses a deep autoencoder for feature learning to classify ECG signals is proposed.
-DBNs: A DBN is a multi-layer generative graphical model. Generally, the DBN architecture is composed of stacked RBMs. An RBM is a Markov random field model that consists of a visible layer corresponding to the input layer and a hidden layer corresponding to the latent feature representation. Because the connections between nodes are bidirectional, a DBN can, given an input vector, extract low-dimensional latent features and select critical channels to classify affective states using ECG signals. Besides, each layer is first trained in an unsupervised manner, and the network is subsequently fine-tuned by adding a linear classifier to the top layer of the DBN and performing a supervised optimization.
The seminal work of [336] introduced the RBM, an undirected model for binary random variables that models distributions over binary-valued data. Each RBM includes a layer of visible units representing the data and a layer of hidden units that learn to represent features and capture higher-order correlations. Moreover, the seminal work of [337] introduced the DBN, which models evolving random variables over time and is composed of multiple RBM layers. In each layer, the RBM receives the inputs of the previous layer and feeds the RBM in the next layer. Hence, training DBNs involves training RBMs layer by layer, from the bottom up. In [19], an RBM combined with a DBN is proposed for single-lead ECG classification, following the detection of ventricular and supraventricular heartbeats. Their results show that the RBM and DBN can achieve high average recognition accuracies of 93.63% and 95.57%, respectively, at a low sampling rate of 114 Hz using the MIT-BIH database.
-DBMs: Another pre-training method, namely the DBM, has also been presented in [344], where a stack of slightly modified RBMs is used to initialize the weights of a DBM. Results in [344] show that the DBM learns good generative models and performs well in recognition tasks.
(c) DSSL: Complementing unsupervised learning (with unlabeled data) with supervised learning (with labeled data) is referred to as DSSL (or hybrid deep networks). That is, DSSL algorithms use both generative (without labeled data) and discriminative (with labeled data) model components. One example of DSSL is the GAN.
-GANs: To learn ECG signals for classification, GANs can be leveraged to learn deep representations without extensively annotated training data [345]. GANs are a class of unsupervised machine learning algorithms. A GAN has two components, a generator and a discriminator, that compete against each other during training.
While the generator produces samples that approximate the real data distribution from random data, the discriminator must distinguish between true samples and false samples [345]. Game training is adopted to optimize the weight parameters of the generator and discriminator networks and thereby enhance the generalization capability of the model. In a GAN, the generator maps samples from an arbitrary latent distribution to the ambient data space, while the adversarial discriminator aims to discriminate between real and generated samples [345]. Besides, the adversarial training method is applied to optimize both modules. In [345], GANs have been investigated for synthesizing ECG signals that can then be used as additional training data to enhance the classifier performance, and the empirical results show that the generated signals significantly improve ECG classification. A typical characteristic of all the introduced deep learning methods is their ability to conserve the temporal variation of the ECG signal, which is considered a necessity for ECG classification [20]. This requires the ability to learn both short-term and long-term dependencies for efficient classification. Generally, the processing complexity of a deep learning method depends on the number of floating-point operations required to process the model [20]. That is, there is a strong relationship between the floating-point operations of a CNN model and the model inference time ($R^2 = 0.8888$, p-value $< 0.0015$) and between the floating-point operations and the model energy consumption ($R^2 = 0.9641$, p-value $< 0.0001$) [20].
Shortcomings of the Deep Learning Methods: Notwithstanding the benefit of deep learning methods in enhancing the classification performance compared to conventional machine learning methods, they have the following shortcomings: (i) Too little training data leads to overfitting in deep learning methods, as the model fits the training data too closely and does not generalize adequately to the test data. Therefore, shallow methods render more reliable performance on a small number of data samples. (ii) Most modern deep learning models are inclined to learn the noise characteristics of the ECG signal, giving rise to incorrect results. This challenge varies noticeably with the dataset size.
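The sum, mean, and multiplication rules named above for score-level fusion of two CNN models can be sketched as follows. This is a minimal illustration that assumes each model outputs a match score already normalized to [0, 1]; the function name `fuse_scores` and the example scores are hypothetical and do not reproduce the pipeline of [340].

```python
import math


def fuse_scores(scores, rule="sum"):
    """Combine per-model match scores into a single fused score.

    Assumption: scores are already normalized to the same range.
    """
    if not scores:
        raise ValueError("at least one score is required")
    if rule == "sum":
        return sum(scores)
    if rule == "mean":
        return sum(scores) / len(scores)
    if rule == "product":
        return math.prod(scores)
    raise ValueError(f"unknown fusion rule: {rule}")


# Hypothetical match scores from a 1-D CNN and a 2-D CNN for one claim.
example_scores = [0.8, 0.6]
fused = {rule: fuse_scores(example_scores, rule)
         for rule in ("sum", "mean", "product")}
```

The fused score is then compared against a decision threshold in the same way as a single-model score; the threshold has to be chosen per rule because the three rules produce values on different scales.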

3) kNN CLASSIFICATION MODELS
One widely used algorithm for feature extraction and classification in ECG-based biometric recognition and authentication is the k-nearest neighbor (kNN) rule [346]. The fundamental concept of kNN classification relies on the principle that the instances within a dataset usually lie in close proximity to other instances that have comparable characteristics [332]. For example, if the instances are tagged with a classification label, then the label of an unclassified instance is resolved by considering the class of its nearest neighbors [332]. That is, kNN finds the k instances nearest to the query instance and determines its class as the single most common class label among them. The potential of the kNN classification model has been shown in ECG-based biometric recognition systems, yet there are some limitations regarding its practicality, which include [346], [347]: (i) high storage requirements, (ii) sensitivity of the classification to the local distribution of the training samples, which may result in unstable performance, and (iii) the lack of a principled means to choose k other than cross-validation or a related computationally expensive method.
Notably, choosing a proper value of k is crucial, as it influences the performance of the kNN classification task. For example, a kNN classifier may wrongly classify a sample for the following reasons: 1) kNN is an instance-based machine learning algorithm whose learning process directly applies the available training set, which makes the classification sensitive to the local distribution of the training samples and leads to unstable performance [347], [348], and 2) for the basic kNN algorithm, the density of the data clusters influences its performance, which results in wrong decisions [347], [348]. In [349], kNN, linear SVM, and neural network classifiers were used for ECG-based human recognition on the MIT-BIH and ECG-ID databases. Their results achieved an EER of 3.05% using the ECG-ID database. In [350], the authors reported an identification accuracy of 99.68% employing a probabilistic neural network (PNN) and a kNN classifier using MP-based indices for an ECG-based biometric system, with ECG signals of 90 participants from the ECG-ID database. The authors in [351] developed an automated expert human identification system using ECG signals of 90 subjects selected from the ECG-ID database available at Physionet and applied a kNN classifier to identify individuals using a 5-fold cross-validation scheme, achieving the highest average accuracy of 97.62 ± 1.9%. The authors in [352] use a discrete wavelet transform to extract wavelet coefficients as the feature vector and employ kNN as the classifier for ECG-based biometric identification. They achieved identification rates of 93.1%, 99.4%, and 82.3% using the MIT-BIH/Arrhythmia, MIT-BIH/Normal Sinus Rhythm, and ECG-ID databases, respectively.
In [353], an average accuracy rate of more than 98% was obtained for all classifiers performing biometric identification in a mobile environment using ECG signal across four commonly used classification algorithms, namely Bayes network (BN), naive Bayes (NB), MLP, and kNN employing the MIT-BIH normal sinus rhythm database.
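The kNN rule described in this subsection, which finds the k nearest training instances and takes the most common label, can be sketched as follows. The Euclidean distance, the two-dimensional feature vectors, and the subject labels are illustrative assumptions; the surveyed works use their own features and distance choices.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest neighbors.

    `train` is a list of (feature_vector, label) pairs.
    Assumption: Euclidean distance on equally scaled features.
    """
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical 2-D heartbeat features for two enrolled subjects.
TRAIN = [
    ((0.0, 0.1), "subject_A"), ((0.2, 0.0), "subject_A"), ((0.1, 0.3), "subject_A"),
    ((5.0, 5.1), "subject_B"), ((5.2, 4.9), "subject_B"), ((4.8, 5.0), "subject_B"),
]

predicted = knn_predict(TRAIN, (0.2, 0.1), k=3)
```

Every query scans the whole training set, which illustrates the storage and computation cost listed among the limitations, and the result can change with `k` when the neighborhoods of different subjects overlap.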

4) LDA BASED CLASSIFIER
LDA is also one of the most prevalent classifiers in ECG-based biometric classification owing to its high recognition performance. The LDA-based classifier assumes that the conditional probability density functions (PDFs) of the classes are normally distributed with equal class covariance. Notably, under these assumptions the Bayes rule reduces to LDA [287]-[289]. That is, the LDA classifier stems from Bayes' principles [287], which require the computation of the posterior probability $P(i|x)$ of the event belonging to class i, given an observation x, as
$$P(i|x) = \frac{P(x|i)\,P(i)}{\sum_{\forall j} P(x|j)\,P(j)},$$
where $P(x|i)$ is the likelihood of the observation x belonging to class i, $P(i)$ is the prior probability of any sample being of class i (conventionally set naively to the number of class-i samples divided by the total number of samples across all classes), and $\sum_{\forall j} P(x|j)P(j)$ is the probability of the observation occurring irrespective of class. Subsequently, Bayes' theorem is adopted for classification by assigning the unknown observation to the class with the highest posterior probability, i.e., $P(x|i)P(i) > P(x|j)P(j), \ \forall j \neq i$. Instead of explicitly determining $P(x|i)$, Gaussian probability distribution functions are generally assumed in the derivation of LDA, which simplifies the training to the estimation of the corresponding parameters $\mathcal{N}(\mu, \Sigma)$ [287].
A method closely linked to LDA, the so-called Fisher's LDA, seldom surfaces in the ECG biometrics literature [324], [354]. Fisher's LDA differs from the conventional LDA in that it does not make some of LDA's underlying assumptions, such as normally distributed classes and equal covariance. The authors in [355] use the LDA classifier to study the viability of ECG-based biometric systems for human identification employing a publicly available database of five subjects. PCA and LDA were applied for feature transformation and classification, respectively, which resulted in a 0% minimum average recognition error. In [356], LDA was used as a classification algorithm for biometric authentication using ECG signals with the spectral power, maximum power, and maximum power frequency in the alpha band as features. Two data sessions from four subjects were acquired using only one bipolar channel (O1A2). The time interval between the sessions ranged from 10 days to 5 months, and the resulting classifiers produced an authentication performance of 98.33% with test recordings of 20 seconds duration. The authors in [34] demonstrate the feasibility of LDA classification for human identification using ECG signals, where 29 subjects (aged between 22 and 48) were tested, with 12 repeat sessions, for a total of 41 sessions within the data set. Each individual session contained a set of seven two-minute tasks, and the results show that LDA improves the performance of the classification model.
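The Bayes-rule view of LDA given in this subsection can be made concrete with a small sketch. It assumes a single scalar feature, Gaussian class likelihoods with one pooled variance shared by all classes, and priors equal to the class frequencies; the function names and the toy data are hypothetical.

```python
import math
from collections import defaultdict


def fit_lda_1d(samples):
    """Estimate class means, frequency priors, and one pooled variance.

    `samples` is a list of (x, label) pairs with scalar x.
    """
    groups = defaultdict(list)
    for x, label in samples:
        groups[label].append(x)
    n_total = len(samples)
    means = {c: sum(xs) / len(xs) for c, xs in groups.items()}
    priors = {c: len(xs) / n_total for c, xs in groups.items()}
    squared_dev = sum((x - means[c]) ** 2 for c, xs in groups.items() for x in xs)
    variance = squared_dev / (n_total - len(groups))  # pooled (shared) variance
    return means, priors, variance


def posteriors(model, x):
    """Return P(i|x) for every class i via Bayes' rule.

    The Gaussian normalizing constant is omitted because the shared
    variance makes it identical for all classes, so it cancels.
    """
    means, priors, variance = model
    joint = {c: priors[c] * math.exp(-((x - means[c]) ** 2) / (2.0 * variance))
             for c in means}
    evidence = sum(joint.values())
    return {c: value / evidence for c, value in joint.items()}


# Hypothetical scalar feature values for two enrolled subjects.
DATA = [(1.0, "A"), (1.2, "A"), (0.8, "A"), (3.0, "B"), (3.2, "B"), (2.8, "B")]
model = fit_lda_1d(DATA)
post = posteriors(model, 1.1)
decision = max(post, key=post.get)
```

With equal priors and a shared variance, the decision boundary lies midway between the two class means, which is the linear boundary that gives LDA its name.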

5) ECG SIGNAL CLASSIFICATION BASED ON KERNEL METHODS
Kernel methods, whose best-known member is the SVM, are effective in ECG signal classification. SVMs were originally formulated for binary (two-class) classification problems. The method has since been extended to continuous outcomes and to classification with more than two classes. Notably, there are two types of methods for multiclass SVM. One constructs and combines several binary classifiers, while the other directly considers all data in one optimization formulation. SVM is a supervised learning algorithm introduced in [357] for classification. Given a set of training data in a two-class learning task, an SVM training algorithm constructs a classification function that assigns new observations to one of the two classes on either side of a hyperplane, making it a non-probabilistic binary linear classifier [357]. In [358], SVM was applied to ECG signal classification using a fusion of three proposed types of characteristics, namely cepstral coefficients, zero-crossing rate, and entropy, and achieved a human identification accuracy of 100% on ECG signals from the MIT-BIH database, ECG-ID (five recordings), and ECG-ID (two recordings). In [346], physiological information present in ECG intervals and amplitudes was proposed for ECG signal classification using ANN, kNN, and SVM classifiers on two ECG databases, namely the MIT-BIH Arrhythmia and ECG-ID databases. The results show that the SVM classifier outperforms the others with a 93.709% overall classification accuracy. In [359], six classifiers, namely ANNs, decision trees (DTs), Fisher linear discriminant analysis (FLDA), kNNs, NB, and SVMs, were used for ECG-based biometric identification for human authentication. They employed 1800 ECG signals acquired from 36 subjects using the MIT-BIH database and obtained the highest accuracy rate of 95.46% with the SVM classifier.

6) DECISION TREE (DT) CLASSIFICATION MODELS
A decision tree is a classifier that recursively partitions the instance space and is composed of internal nodes, edges, and leaf nodes [360]. Each internal node, the so-called decision node, splits the instance space into two or more sub-spaces based on certain functions of the input attribute values [360]. Each leaf is allocated to one class that describes the most appropriate or frequent target value. Instances are classified by traversing the tree from the root node down to a leaf according to the outcomes of the tests along the path. Moreover, each path can subsequently be transformed into a rule by joining the tests along the path.
In [361], a scheme that considers the ECG signal as a continuous data stream is introduced, where the user is re-authenticated at regular intervals for continuous authentication. The samples are classified using the decision tree, SVM, and k-NN algorithms, which achieved an accuracy of up to 96% with almost perfect system performance (kappa statistic >80%). In [362], a methodology is presented for an ECG-based biometric authentication system using raw ECG signals through EMD. The feature extraction procedure combines five features from the statistical, time, and frequency domains, which are categorized via the decision tree, SVM, and k-NN classification methods. The 10-fold cross-validation-based evaluation shows that the decision tree realizes an accuracy of 96.38%, a sensitivity of 98.2, a specificity of 99.67%, and an error of 3.62% for the classification of 14 subjects. In [363], several features obtained from wavelet transformation, temporal analysis, QRS-complex detection, and power spectral density estimation were used for ECG-based biometric authentication, with BN and decision tree classifiers. The methods were tested on a dataset that contains 18 healthy subjects, 5 men (aged 26 to 45) and 13 women (aged 20 to 50), and achieved better performance for all classifiers in each recognition problem (with a best recognition F score of 0.972). Table 8 summarizes the major findings and results of ECG classification algorithms.
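The root-to-leaf traversal and the conversion of a path into a rule, as described for decision trees at the start of this subsection, can be sketched as follows. The tree is hand-built rather than learned, and the feature names (`qrs_width_ms`, `rr_interval_ms`), thresholds, and subject labels are hypothetical values chosen only for illustration.

```python
# A hand-built binary tree: decision nodes test one feature against a
# threshold; leaf nodes carry a class label. All values are hypothetical.
TREE = {
    "feature": "qrs_width_ms", "threshold": 100.0,
    "left": {"label": "subject_A"},
    "right": {
        "feature": "rr_interval_ms", "threshold": 800.0,
        "left": {"label": "subject_B"},
        "right": {"label": "subject_C"},
    },
}


def classify(node, sample):
    """Walk from the root to a leaf; return the label and the path rule."""
    tests = []
    while "label" not in node:
        feature, threshold = node["feature"], node["threshold"]
        if sample[feature] <= threshold:
            tests.append(f"{feature} <= {threshold}")
            node = node["left"]
        else:
            tests.append(f"{feature} > {threshold}")
            node = node["right"]
    return node["label"], " AND ".join(tests)


label, rule = classify(TREE, {"qrs_width_ms": 110.0, "rr_interval_ms": 750.0})
```

Joining the tests along the path yields a human-readable rule, which is one reason decision trees are often preferred when the classification decision has to be explained.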

VI. ECG SIGNAL APPLICATION DOMAINS
Some of the application domains of ECG include the following.

A. ECG-BASED BIOMETRIC IDENTIFICATION AND AUTHENTICATION
Over recent decades, interest in the ECG signal, which records the electrical depolarization-repolarization patterns of the heart, as a biometric modality for human authentication has grown in the field of pattern recognition. This is mainly due to the inherent nature of the ECG signal, which is characterized by its universality, inherently hidden nature, inherent aliveness detection, and continuous availability [364]. Given the essential and continuous nature of this information source, the ECG signal is considered highly confidential to the user, and it offers strong protection against spoof attacks, unlike conventional biometric systems. ECG attributes or features are designed to classify the subject by exploiting inter-subject variability. Specifically, features rely on the heartbeat morphology over specific time intervals acquired from ECG waves (or on specifically extracted features). The choice of features is driven by the hardware complexity of each functional component, the requirement of real-time identification, and the specific recording equipment. Several works have been proposed on ECG biometric systems using fiducial-based and non-fiducial-based approaches, depending on the need to identify precise points in ECG signals. Due to the lack of a liveness check, biometric systems based on fingerprints, palm print, face recognition, iris recognition, or speech recognition encounter severe challenges caused by data replication and malicious forgery.
Furthermore, ECG-based biometric systems can be employed as an extra authentication factor to enhance security, for instance, in biosafety laboratories, data centers, banks, power plants, clean rooms, or hospitals. However, ECG-based biometric systems still lack extensive investigation in real-life scenarios. Most of the existing works are based on controlled laboratory investigations with medical-grade ECG trackers or depend on data from medical diagnostic ECG databases, for example, PhysioNet PTB [365], MIT-BIH [366], UofTDB [103], and AHA [367]. Medical setups use multiple leads with adhesive electrodes attached to the body, so they achieve a signal with low noise and almost no motion artifacts. Such setups are thus considered unrealistic for practical applications. While recent research is directed at developing ECG biometric systems that can be measured by wearable devices, the quality of the acquired data will be worse in comparison to ambulatory ECG. Specifically, the studies in [368]-[371] employed wearable devices for ECG biometric authentication, but very few participants were involved in the investigations. Moreover, the studies in [287], [291], [372]-[375] used affordable non-medical-grade tracking devices to acquire ECG data at the index fingers.

B. ECG SIGNAL IN MULTIMODAL BIOMETRIC SYSTEMS
While unimodal biometric systems use a single biometric trait of the individual for identification and verification, multimodal biometric systems combine two or more biometric modalities to improve the recognition rate and the resistance to spoof attacks [283], [339], [376]-[379]. In the following, we discuss the advantages of multimodal biometric systems and explore the different types of fusion techniques in multimodal biometric systems.

1) ADVANTAGES OF ECG SIGNAL IN MULTIMODAL BIOMETRIC SYSTEMS
The obvious benefits of ECG signals in multimodal biometric systems are increased recognition performance, enhanced security, and fewer enrolment problems [376], [380]. We describe some of the benefits in detail in the following:
(a) Reliability: During ECG biometric data acquisition, defective equipment, improper sensors, or environmental factors can introduce noise. However, the fusion of multiple biometric modalities in multimodal biometric systems provides additional information for reliable human identification.
(b) Universality: Although universality is a main characteristic of a biometric modality, a given modality may not be truly universal [339], [378]. For example, the National Institute of Standards and Technology (NIST) finds that, as a result of disabilities, cuts, and bruises, approximately 2% of the world population may not be fully fingerprintable [381]. Hence, ECG in multimodal biometric systems has been used to overcome the limitations of unimodal biometrics and provide high-accuracy recognition.
(c) Uniqueness: The uniqueness of a biometric trait is related to inter-subject variability. Several biometric traits of different subjects (e.g., identical twins) can occasionally be quite alike and lead to high false recognition rates. However, multimodal biometric systems may provide supplementary information to overcome this inherent limitation.
(d) Security: Although biometric traits are extremely difficult to steal, it is still feasible to bypass a unimodal biometric system using spoofed traits. Hence, the ECG signal in a multimodal biometric system enables the fusion of decisions made on distinct traits to increase the security of biometric authentication [339], [378], [379]. That is, if one of the modalities is compromised, the system can still guarantee security using the remaining biometric modalities, since it will be difficult for the intruder to spoof multiple biometric traits.
Therefore, the use of ECG signals in multimodal biometric systems provides improved performance over unimodal biometrics in authenticating human subjects under multiple limiting factors and spoofing attacks.

2) INTEGRATION MODES
Generally, multimodal biometric systems can be realized in one of several diverse modes: sequential mode, serial mode, parallel mode, hierarchical mode, or pipelining mode.
(a) Parallel Mode: The parallel fusion mode has been studied more extensively than the serial fusion mode. The parallel fusion mode uses information from multiple modalities simultaneously, and thus all individual biometric recognition systems should take the same time; otherwise, one system has to wait for the others. Accordingly, this approach is best applied when all individual biometric recognition systems are computationally fast [377]. Parallel mode is beneficial when the individual biometric recognition systems have different confusion matrices, but it has the drawback that separate sensors are required for each modality, even if different modalities could use the same sensor equipment [377]. A multimodal biometric system based on different fusion levels of ECG and fingerprint using different classifiers is proposed in [378], using 47 human subjects from the MIT-BIH database. Results indicate an area under the ROC curve (AUC) of up to 0.985 for the sequential multimodal system and up to 0.956 for the parallel multimodal system, compared to the unimodal systems, which achieved an AUC of up to 0.951 and 0.866 for the ECG and fingerprint biometrics, respectively.
(b) Serial Mode: Several schemes have been introduced that investigate how to use biometric traits sequentially for recognition. In a multimodal biometric system operating in serial mode, the modalities are examined one after another. Therefore, depending on the setting, the system may give the final recognition decision before acquiring all the modalities. That is, the final decision is an acceptance as soon as any biometric modality yields a match, and a rejection otherwise, as shown in Fig. 7. From Fig. 7, it is evident that an early decision can reduce the overall recognition time in this realization. Consequently, multiple modalities do not have to be acquired simultaneously.
Moreover, a single sensor can be utilized, for example, for the serial acquisition of both the user's face and iris. Research has shown that the multimodal biometric systems that operate at the serial modes are found to be more robust than the parallel ones [283], [339], [339], [376]- [379]. (c) Sequential Mode: In the sequential mode, the multimodal biometric systems are sequentially combined, with each biometric system having the option to reject [377]. Typically, once any system perceives a  quality is not satisfying, then the modality is rejected, and the decision relies on the second biometric system, and so on [377]. Besides, the second system is typically more costly, informative, and computationally complex than the first. However, the classification time is much less in the multimodal biometric system strategy at the reasonable expense of classification accuracy [377]. For example, in a sequential mode multimodal biometric system that combines ECG signal and fingerprint modalities, the system must start with the ECG authentication to ensure liveness detection [339]. Since ECG biometric renders inherent liveness detection, which has a computational advantage, fingerprint authentication is superior at accepting genuine users. Subsequently, after the multimodal biometric system has rejected the impostors and accepted the genuine users, the remaining subjects would be authenticated using the fusion of ECG and fingerprint, as illustrated in Fig. 8. Specifically, in Fig. 8, the authentication first ensures liveness detection using ECG, with either a reject (i.e., the overall system final decision) or an accept decision to acquire fingerprint modality for authentication in the next stage. The final decision of authentication for a user occurs at this stage with an acceptance decision [339]. Conversely, both the ECG and fingerprint modalities of the remaining rejected users are combined at the decision level to make the final decision for those users [339], [377]. 
(d) Hierarchical Mode: Multimodal biometric systems realized in the hierarchical mode of operation combine the serial and parallel modes [377] and thus inherit the benefits of both. This mode is used mostly when a large number of biometric modalities is available. In [40], a fiducial-detection-based framework that incorporates analytic and appearance attributes for human identification using ECG data is presented, where nearest neighbor (NN) classifiers in combination with the Euclidean distance are employed. Here, the hierarchical strategy splits the problem into two subproblems: 1) a first-level classification based on analytic features alone (the time and amplitude of fiducial points), and 2) a PCA-based classification module that is subsequently applied to the subjects that may be confused by the first stage. This hierarchical method achieves a subject recognition rate of 100% for both datasets and an ECG recognition accuracy of 98.90% for PTB and 99.43% for MIT-BIH.
(e) Pipelining Mode: A biometric system operating in the pipelining mode obtains the benefits of a multimodal system while employing a single sensor and a single feature extraction scheme. While the features of the first modality are being extracted, the features of the second modality are acquired [377].
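The sequential ECG and fingerprint flow described for Fig. 8 can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the function `sequential_authenticate`, the `Thresholds` values, the convention that scores are similarities in [0, 1], and the simple score average used in the last stage as a stand-in for the fusion rule of [339], which is not specified in this section.

```python
from dataclasses import dataclass


@dataclass
class Thresholds:
    # All values are hypothetical; a real system would tune them on validation data.
    ecg_liveness: float = 0.5
    fingerprint: float = 0.7
    fused: float = 0.6


def sequential_authenticate(ecg_score, fingerprint_score, t=Thresholds()):
    """Return (decision, deciding_stage) for one authentication attempt.

    Scores are assumed to be similarity scores in [0, 1]; higher means a better match.
    """
    # Stage 1: the ECG acts as the liveness gate; a reject here is final.
    if ecg_score < t.ecg_liveness:
        return "reject", "ecg"
    # Stage 2: the fingerprint alone can accept the user.
    if fingerprint_score >= t.fingerprint:
        return "accept", "fingerprint"
    # Stage 3: users rejected by the fingerprint stage are decided by combining
    # both modalities (a plain average here, chosen only for illustration).
    fused = 0.5 * (ecg_score + fingerprint_score)
    return ("accept" if fused >= t.fused else "reject"), "fusion"


if __name__ == "__main__":
    for scores in [(0.3, 0.9), (0.8, 0.9), (0.8, 0.5), (0.55, 0.6)]:
        print(scores, sequential_authenticate(*scores))
```

The early returns mirror the property highlighted for serial and sequential realizations: many attempts are decided before the later, more expensive stages are run.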

3) INFORMATION FUSION TECHNIQUES IN MULTIMODAL BIOMETRIC SYSTEMS
The information of multiple biometric modalities can be fused in several ways, which are categorized based on parameters such as the fusion scheme, the information sources, and the fusion levels [382].
(a) Fusion Scheme: Sequential and parallel fusion are the two topological types of multimodal biometric fusion techniques [382]. In parallel fusion, the biometric modalities are processed simultaneously, whereas in sequential fusion they are processed one after another in a top-down merge until an acceptable match is obtained.
(b) Information Sources [382]: The information for multimodal biometric systems can be obtained from the following sources:
• Multiple Traits: Different biometric modalities are fused for human identification.
• Multiple Sensors: Different sensors are used to capture a single biometric modality for human identification.
• Multiple Samples: Multiple representations or samples of a single biometric modality are captured by a single sensor.
• Multiple Algorithms: The multimodal biometric system applies different matching algorithms to a single biometric modality.
• Multiple Instances: The multimodal biometric system utilizes multiple instances of a single biometric modality.
(c) Fusion Levels: While fusion can be achieved at diverse levels in multimodal biometric systems [383], [384], such as the data level, feature level, score level, rank level, and decision level, fusion at the match score level has been widely investigated in the literature. We briefly highlight some of the commonly used fusion levels, with more emphasis on the score level:
• Decision-level: Here, each biometric trait in the multimodal biometric system yields a decision independently of the other biometric traits, and these decisions are subsequently fused to form a single decision [384].
• Feature-level [385]: Here, the feature sets arising from multiple biometric sources are fused into a single feature set via proper feature normalization, transformation, and reduction techniques. However, in feature-level fusion, the feature sets arising from multiple biometric sources may not be readily fused [382], e.g., one-dimensional signals from an ECG and two-dimensional images from a camera.
• Score-level [384]: Match score-level fusion fuses the match scores generated by multiple classifiers (or matchers) to reach a decision regarding the identity of the subject. Score-level fusion can be realized by several schemes premised on different models, such as density-based, classifier-based, and transformation-based score fusion schemes [384]. Score-level fusion is the preferable strategy for multimodal biometric systems since classifier scores are straightforward to obtain and process for fusion [283]. Typically, realizing score-level fusion entails determining how much impact each classifier in the multimodal biometric system has on the final class output. While match score fusion has been shown to be effective [384], its matching performance is compromised under diverse conditions.
- Density-Based Score Fusion [383]: The density-based score fusion scheme relies on estimating density functions for the genuine and impostor match score distributions. While the scheme employs the likelihood ratio test to express the fusion rule, it can be degraded by inaccurate density functions for the genuine and impostor scores. Density estimation may be categorized into parametric and non-parametric approaches. Among the parametric density estimation methods, the most prominent is the maximization of the likelihood of the samples given the parameters via an expectation-maximization (EM) algorithm. Generally, parametric density estimation methods may rely on incorrect model assumptions (e.g., Gaussian densities), which can give rise to sub-optimal fusion rules. Other drawbacks include: 1) the number of components in the mixture has to be chosen, 2) singularities may occur, and 3) the resulting densities are prone to overfitting the data. Non-parametric density estimation assumes that the data follow an unknown probability law that has a density function and builds an estimate of that density function. The simplest non-parametric method is kernel density estimation (i.e., mixture densities with components centered on the data points are used). Non-parametric density estimation is affected by the limited number of available training samples (notably, genuine scores), which reduces the likelihood of devising an efficient fusion rule. Other principal drawbacks of this approach include [386] 1) the determination of the component parameters, which affect the shape and smoothness of the densities, and 2) the complex function representations involving the whole data set.
- Classifier-Based Score Fusion: In classifier-based fusion schemes, the model is a classifier. Hence, notable statistical and machine learning classification approaches have been employed, such as neural-network-based classifiers, fuzzy clustering, the kNN classifier, (classical) k-means clustering, SVM, and the Bayesian classifier.
- Transformation-Based Score Fusion: In transformation-based fusion schemes, the model relies on estimating normalization functions.
In Table 9, we compare the authentication performance of some state-of-the-art multimodal authentication algorithms.
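As a rough illustration of two of the score-level schemes discussed above, the following Python sketch shows (i) a transformation-based rule, using min-max normalization followed by a weighted sum, and (ii) a density-based likelihood ratio under a parametric Gaussian assumption. The function names, the choice of min-max normalization and of the weighted sum rule, and all numeric values are assumptions made for illustration; they are not taken from the surveyed works.

```python
import math


def min_max_normalize(score, lo, hi):
    """Map a raw matcher score to [0, 1] using the min/max observed in training."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    # Clip so that scores outside the training range stay within [0, 1].
    return min(1.0, max(0.0, (score - lo) / (hi - lo)))


def weighted_sum_fusion(scores, ranges, weights):
    """Transformation-based fusion: weighted sum of min-max normalized scores."""
    if not (len(scores) == len(ranges) == len(weights)):
        raise ValueError("scores, ranges, and weights must have equal length")
    normalized = [min_max_normalize(s, lo, hi) for s, (lo, hi) in zip(scores, ranges)]
    # Dividing by the weight total keeps the fused score in [0, 1].
    return sum(w * n for w, n in zip(weights, normalized)) / sum(weights)


def gaussian_pdf(x, mean, std):
    """Density of a univariate Gaussian distribution."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))


def gaussian_likelihood_ratio(score, genuine, impostor):
    """Density-based fusion statistic: p(score | genuine) / p(score | impostor).

    `genuine` and `impostor` are (mean, std) pairs assumed to have been estimated
    from training scores. The Gaussian form is exactly the kind of model assumption
    that can be wrong in practice, and the ratio is undefined if the impostor
    density underflows to zero.
    """
    return gaussian_pdf(score, *genuine) / gaussian_pdf(score, *impostor)


if __name__ == "__main__":
    # Hypothetical ECG matcher score on a 0-100 scale and fingerprint score on 0-1.
    fused = weighted_sum_fusion([80, 0.4], [(0, 100), (0.0, 1.0)], [0.6, 0.4])
    print("fused score:", round(fused, 3))
    print("likelihood ratio:", gaussian_likelihood_ratio(0.8, (0.8, 0.1), (0.3, 0.1)))
```

In both cases the final accept or reject decision would follow from comparing the fused score, or the likelihood ratio, with a threshold chosen for the desired operating point.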

4) COMPARISON BETWEEN UNIMODAL AND MULTIMODAL BIOMETRIC RECOGNITION SYSTEMS
• Accuracy: While the multimodal biometric systems use multiple biometric modalities to identify a person, the unimodal biometric systems use a single biometric modality. Hence, the multimodal biometric systems ensure higher identification accuracy [395].
• Security: Multimodal biometric systems provide anti-spoofing measures since they acquire multiple biometric modalities, unlike unimodal biometric systems. Thus, an intruder finds it difficult to simultaneously spoof multiple biometric modalities of a legitimate user [395].
• Universality [395]: Multimodal biometric systems tackle the non-universality problem, as acquiring multiple biometric modalities can provide better coverage of the population than a unimodal biometric system.
• Liveness Detection: Unlike unimodal biometric systems, a multimodal biometric system that tackles the non-universality problem can use other forms of biometrics to authenticate a person [396], even if that person is unable to provide one form of biometric owing to illness or disability.
• Cost-Effective: Unlike unimodal biometric systems, multimodal biometric systems are cost-effective since they provide greater security levels to reduce the risk of criminal attack [397].

VII. ECG SIGNAL EVALUATION METRICS
Typically, for classification tasks, a confusion matrix is determined from the classification results, and most of the evaluation metrics are derived from the counts that the matrix stores. Specifically, we present an illustrative confusion matrix for a two-class classification problem in Table 10. Notably, problem transformation techniques reduce a multi-class problem to multiple two-class problems. From Table 10, we observe that there are four possible prediction outcomes. The true positive and true negative outcomes are correct classifications, while the false positive and false negative outcomes are the two possible types of errors. Let us denote by a and d the number of correct predictions that instances are negative and positive, respectively, and by b and c the number of incorrect predictions that instances are positive and negative, respectively. In the following, we present the success measures employed in evaluating ECG signal analysis and classification tasks.
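Before turning to the individual measures, the counts a, b, c, and d defined above can be obtained from true and predicted labels as in the following minimal Python sketch. It also illustrates the reduction of a multi-class problem (for example, one class per enrolled subject) to several two-class problems by treating each class in turn as the positive class. The function names and the toy labels are hypothetical.

```python
def binary_counts(y_true, y_pred, positive):
    """Return (a, b, c, d), treating `positive` as the positive class.

    a: true negatives, b: false positives, c: false negatives, d: true positives.
    """
    a = b = c = d = 0
    for t, p in zip(y_true, y_pred):
        if t == positive and p == positive:
            d += 1
        elif t == positive:
            c += 1  # a positive instance predicted as negative
        elif p == positive:
            b += 1  # a negative instance predicted as positive
        else:
            a += 1
    return a, b, c, d


def one_vs_rest_counts(y_true, y_pred):
    """Reduce a multi-class result to one two-class confusion matrix per class."""
    labels = sorted(set(y_true) | set(y_pred))
    return {label: binary_counts(y_true, y_pred, label) for label in labels}


if __name__ == "__main__":
    # Toy identification result with three hypothetical subjects.
    y_true = ["s1", "s1", "s2", "s3", "s2"]
    y_pred = ["s1", "s2", "s2", "s3", "s1"]
    for label, counts in one_vs_rest_counts(y_true, y_pred).items():
        print(label, counts)
```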
(a) Accuracy: The accuracy measure is defined as the ratio of correctly classified samples to the total number of classified samples [398], [399]:
$$\text{Accuracy} = \frac{a + d}{a + b + c + d}.$$
Notably, classification accuracy is frequently used in ECG-based investigations [399] as a metric for performance evaluation. The foremost drawbacks of accuracy as an evaluation measure are that (1) it ignores the distinctions between the error types, and (2) it depends on the class distribution of the dataset.
(b) Precision: The precision, also called the positive predictive value, measures the proportion of positively predicted samples that are actually positive and is determined as
$$\text{Precision} = \frac{d}{b + d}.$$
(c) Sensitivity: The sensitivity metric measures the proportion of actual positive samples that are correctly identified as positive and is calculated as [399]
$$\text{Sensitivity} = \frac{d}{c + d}.$$
Several ECG-based biometric recognition systems prefer the sensitivity measure as the evaluation metric [399].
(d) Specificity: The specificity metric measures the proportion of actual negative samples that are correctly identified as negative and is calculated as [399]
$$\text{Specificity} = \frac{a}{a + b}.$$
Several ECG-based biometric recognition systems also prefer the specificity measure as the evaluation metric [399].
(e) F-Measure: The F-measure is the harmonic mean of precision and sensitivity and is employed as a single measure to characterize the overall performance [398].
(f) Receiver Operating Characteristic: The receiver operating characteristic (ROC) curve shows the relationship between sensitivity and specificity for every possible cut-off of a test or a combination of tests.
(g) Matthews Correlation Coefficient: The Matthews correlation coefficient (MCC) measures the correlation between the actual classes and the predicted classes. It considers all the true and false values, which is why it is generally regarded as a balanced measure that can be used even when the classes differ considerably in size. The MCC can be calculated as
$$\text{MCC} = \frac{a d - b c}{\sqrt{(d + b)(d + c)(a + b)(a + c)}}.$$
The following are some remarks about the MCC metric [400]: (1) the MCC can be determined from the confusion matrix; (2) the MCC uses all four counts (a, b, c, and d), which gives a more complete picture of the performance of classification algorithms; (3) if any of the sums (d + b), (d + c), (a + b), or (a + c) is zero, the MCC is not defined; and (4) the MCC takes values within the interval [−1, 1], with 1 indicating absolute agreement, −1 absolute disagreement, and 0 indicating that the prediction is uncorrelated with the ground truth.
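The measures above can be computed directly from the four counts a, b, c, and d. The following Python sketch uses the standard definitions with the notation of this section (a: true negatives, b: false positives, c: false negatives, d: true positives). The function name `evaluation_metrics` and the example counts are hypothetical, and returning NaN for undefined ratios is a design choice of this sketch.

```python
import math


def evaluation_metrics(a, b, c, d):
    """Compute common measures from two-class confusion matrix counts.

    a: true negatives, b: false positives, c: false negatives, d: true positives.
    Ratios with a zero denominator are reported as NaN.
    """
    nan = float("nan")

    def ratio(num, den):
        return num / den if den else nan

    mcc_den = math.sqrt((d + b) * (d + c) * (a + b) * (a + c))
    return {
        "accuracy": ratio(a + d, a + b + c + d),
        "precision": ratio(d, b + d),
        "sensitivity": ratio(d, c + d),
        "specificity": ratio(a, a + b),
        # Harmonic mean of precision and sensitivity, written in terms of counts.
        "f_measure": ratio(2 * d, 2 * d + b + c),
        # Undefined (NaN) when any of the four marginal sums is zero.
        "mcc": ratio(a * d - b * c, mcc_den),
    }


if __name__ == "__main__":
    # Hypothetical counts: 50 TN, 10 FP, 5 FN, 35 TP.
    for name, value in evaluation_metrics(50, 10, 5, 35).items():
        print(f"{name}: {value:.4f}")
```

With these example counts, the accuracy is 0.85 while the sensitivity is 0.875 and the specificity is about 0.833, which illustrates how accuracy alone hides the distinction between the two error types.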

VIII. CONCLUSION
Typically, one heartbeat of an ECG signal consists of several waveforms, generally the P wave, the QRS complex, and the T wave. Nonetheless, ECG signals are easily influenced by diverse factors. Biometric recognition systems have been developed continually to enhance security levels and well-being in the working environment. This development is made feasible by pattern recognition, machine learning, and deep learning methodologies.
In this paper, we presented a comprehensive survey on ECG signals as a new biometric modality for human authentication, covering several topics such as ECG preprocessing, feature extraction, feature selection, feature transformation, feature classification, databases, and performance measures for evaluating the accuracy of the ECG classifier. Specifically, in the feature extraction section, two fundamental methods, namely fiducial and non-fiducial techniques, have been studied for ECG-based biometric recognition. Significantly high performance has been realized by employing fiducial methods of feature extraction with small databases. In contrast, the non-fiducial methods give relatively high efficiency for a large population and do not require locating the fiducial points within the ECG signal. Thus, exploiting non-fiducial methods for feature extraction from the ECG signal is advantageous for realizing a high-performance ECG-based biometric recognition system. In the feature classification section, we reviewed existing DNN approaches employed for ECG biometric recognition systems from the viewpoints of models, databases, and tasks while emphasizing the recent research advances, unresolved challenges, and research opportunities. We observed that deep learning classification methods can potentially realize better performance than conventional methods for ECG-based biometric systems. However, if we are exclusively concerned with the best possible classification accuracy, it might be challenging to realize a single classifier that performs as well as an ensemble of classifiers. Notwithstanding their benefits, ensemble techniques have at least three potential drawbacks: (i) the increase in storage needed to keep all component classifiers after training, (ii) the increase in computation needed to process all component classifiers, and (iii) the decreased comprehensibility, for non-expert users, of decisions made by multiple classifiers.
In the database section, we introduced most of the available ECG signal databases and presented succinct information about them. Notably, the inadequate number of publicly available databases constitutes one of the principal challenges in ECG studies. Thus, researchers are encouraged to publish their ECG databases using their local ECG repository. Moreover, we addressed the classification of ECG-based biometric recognition systems, namely unimodal and multimodal. Premised on the drawbacks of ECG-based unimodal biometric recognition systems as presented, the ECG-based multimodal biometric recognition system has been introduced mainly as a desirable means of solving these diverse challenges. The various methods, fusion levels, and integration strategies used to combine the information in multimodal biometric systems were also presented. While more research is needed to find solutions to the shortcomings of the different fusion methods, multimodal biometric systems can be crucial for two-factor authentication for smart card tokens and mobile applications requiring security for transactions. Therefore, susceptibility to spoofing attacks should remain a key concern in all introduced ECG-based biometric recognition systems and algorithms. Future work will discuss how the ECG can secure IoT technology, since the ECG is fast becoming a key tool for authenticating IoT devices. Thanks to IoT, several approaches have been proposed that apply user data and ECG signals for biometric authentication [401]-[403]. For example, in the IoT-based biometric system of [401], the user's ECG signal is acquired with wearable sensors and sent via the Bluetooth connection of a device (e.g., a mobile phone) to the cloud, where a trained deep authentication model resides, for the authentication task. However, [401] does not use template protection methods to increase system security against spoofing attacks, which is a possible research direction.