Adaptive Filtering Approach With Forgetting Factor for Stochastic Signals Applied to EEG

This paper presents a new stochastic adaptive estimation-identification technique for nonstationary systems. The proposed method enhances the initial results of an on-average estimation, and of the identification built on it, through a generalized adaptive function based on the Exponential Forgetting Factor (EFF) and the Sliding Mode (SM) applied to the identification error. The function is implemented in three stages (estimation, adaptive estimation, and adaptive estimation-identification), which allows us to observe the gradual convergence to a nonstationary reference signal. Simulations first check the convergence level obtained from the estimation and identification of artificial signals. The algorithm is then applied to real references, namely Electroencephalogram (EEG) signals taken from a public database, indirectly finding their internal nonstationary gains. Finally, the results include a performance comparison between the proposed strategy and the Recursive Least Square (RLS), the Least Mean Square (LMS), and the Kalman Filter (KF).


I. INTRODUCTION
Analyzing processes or systems requires creating simplified representations for their study. Depending on the amount of accessible information, there are two approaches to obtain them: using physical concepts or using mathematical images.
Signals are quantitative representations of measurable characteristics of phenomena or systems. Analyzing these signals is useful to define relationships between them, creating empirical models without requiring a priori knowledge, in contrast to models based on physical first principles. In this context, a Black Box (BB) is an example of a model created only from its measured input and output information [1].
Within the filtering theory, the Parameter Estimation (PE) and the State Identification (SI) allow reconstructing the states of the BB [2].
The associate editor coordinating the review of this manuscript and approving it for publication was Md. Kamrul Hasan .
In this form, according to the new adaptive filtering approaches, the identification error adjusts the PE, and it adapts the SI to the reference without losing the time restrictions.

A. RELATED WORKS
For BB model responses with Linear and Time-Invariant (LTI) characteristics, it is common to use standard estimation algorithms built on the Wiener theory, such as the Least Mean Square (LMS), the Recursive Least Square (RLS), and the Kalman Filter (KF), without needing adaptive strategies [3].
In their regular versions, the LMS entails $O(N)$ complexity; the RLS, which converges faster than the LMS, entails $O(N^2)$; and the Kalman filter is the most expensive, with $O(N^3)$ complexity.
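To make the $O(N)$ cost of the LMS concrete, the following is a minimal sketch of a textbook LMS identifier in plain Python. It is not the implementation used in this paper; the function name `lms_identify`, the step size `mu`, and the 3-tap test system `true_w` are assumptions chosen only for illustration.

```python
import random


def lms_identify(x, d, n_taps, mu):
    """Adapt an n_taps FIR filter so its output tracks d.

    Returns the final weights and the a priori error at every sample.
    """
    w = [0.0] * n_taps
    buf = [0.0] * n_taps  # most recent input sample first
    errors = []
    for x_k, d_k in zip(x, d):
        buf = [x_k] + buf[:-1]
        y_k = sum(wi * xi for wi, xi in zip(w, buf))        # N multiply-adds
        e_k = d_k - y_k
        w = [wi + mu * e_k * xi for wi, xi in zip(w, buf)]  # N more multiply-adds
        errors.append(e_k)
    return w, errors


# Hypothetical noise-free test system (illustration only).
random.seed(0)
true_w = [0.5, -0.3, 0.1]
x = [random.uniform(-1.0, 1.0) for _ in range(2000)]
d = [sum(true_w[j] * x[k - j] for j in range(3) if k - j >= 0)
     for k in range(len(x))]
w, errors = lms_identify(x, d, n_taps=3, mu=0.1)
```

Each sample costs two passes over the `n_taps` weights, which is where the linear complexity comes from; the RLS additionally updates an $N \times N$ matrix per sample.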
However, the signals studied by different science areas and industries are Nonlinear and Time-Variant (NLTV). They change in frequency and amplitude throughout their evolution and are subject to additional exogenous influences, so describing them requires adjusting the traditional LTI methods.
Therefore, the modeling process needs to include adaptive adjustments to track the signal variations [4], [5], taking advantage of their intrinsic properties applied to the identification [6]-[9].
The Wiener approach applied to Nonlinear (NL) systems uses a nonlinear autoregressive (NAR) model, with a Sliding Window applied to the RLS method [10].
Building on this, the Exponential Forgetting Factor (EFF) allows estimating the correct parameters from the identification results [11]. Under the same considerations, an iterated RLS is preferable for approximating the desired state in stochastic systems [12]. Unfortunately, setting the initial conditions and evaluating the candidate states of the algorithms adds computational cost [13]-[15].
An adaptive KF is useful for multivariable systems with stochastic parameters [16]. However, even though the RLS performs matrix multiplications when applying the Forgetting Factor (FF), the cost of the KF algorithm is higher than that of the RLS for the same process.
The use of fractional calculus [17], [18] is another perspective for the analysis of stochastic signals. It focuses on the Hammerstein description instead of the Wiener description; the latter is the one of interest for this work.

B. INNOVATIVE CONTRIBUTION
Therefore, to provide a new filtering tool for stochastic signals, keeping a low complexity, we propose an adaptive estimation and identification process considering the EFF and the SM described in [11].
The objective is to improve not only the parameters obtained on average but also the complete estimation-identification process, within a generalized correction function for characterizing a reference.
To accomplish the tracking of nonstationary signals, we take advantage of the EFF features, incorporating the sign function and the identification error as a part of the adaptive function arguments. With this form, the proposed method increases the convergence rate to the reference without establishing special conditions in each nonstationary evolution.
Regarding the reference signals, the EEG represents brain activity through electrical impulse measurements [19], [20]. As stochastic signals, EEG recordings have a Time-Variant (TV) behavior for each specific status [21], [22] and for particular signal acquisition conditions.
Knowing the EEG signals behavior and attributes through representative estimated parameters contributes to the clinical scenario. It is helpful to obtain, for example, more accurate data for training algorithms to classify, recognize, or predict certain brain phenomena. This also concerns the Brain-Machine (BM) or Brain-Computer Interfaces (BCI) [23]- [26].
The scope of the novel algorithm is nonstationary filtering, whose aim is to emulate the behavior of the phenomenon without giving an interpretation of it. In this sense, it focuses on signal tracking and not on providing a diagnosis [27].

C. ORGANIZATION
The rest of the paper is organized as follows. Section II explains the new filter and its characteristics. Section III shows the simulation results using the artificial signals and the real EEG signals available in [28], and includes a comparison between the proposed method and the LMS, RLS, and KF, measuring the convergence level and the execution times. Section IV discusses the developed filter and presents the conclusions.

II. ESTIMATION AND IDENTIFICATION PROPOSAL
In the case where the BB response is nonlinear, it is possible to propose a first-order ARMA(1,1) model with nonstationary properties.
In discrete form, for each sampled point ($k \in \mathbb{Z}^+$) within the estimation-identification process, the relationship between the known bounded input $x_k \in \mathbb{R}_{[-1,1]}$ and the estimated parameters $\hat{a}_k \in \mathbb{R}_{[-1,1]}$ gives the identification result $\hat{y}_k \in \mathbb{R}$, which approximates the reference signal ($y_k$); the identification error ($\hat{e}_k \in \mathbb{R}_{[0,\varepsilon]}$, $0 < \varepsilon \le \infty$) is a measure of the convergence.
From the model in [11], the parameters $\hat{a}_k$ calculated on average through the second moment are optimal in terms of probability, and using the mathematical expectation builds the recursive description (1), where (2) represents the recursive calculation of $q_k \in \mathbb{R}$. Then, implementing the parameters $\hat{a}_k$ in the ARMA(1,1) model leads to the identification signal $\hat{y}_k$ as in (3). The error ($\hat{e}_k$), viewed as the difference between the reference ($y_k$) and its identification ($\hat{y}_k$), has a normalized description.
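As a rough illustration of an on-average estimate built from second moments, the sketch below uses a simplified static-gain model $y_k \approx a\,x_k$ rather than the ARMA(1,1) description, and estimates the gain as the ratio of two running means. This is an assumption made for illustration and is not a reproduction of (1)-(3); the names `on_average_estimate`, `p`, and `q` are hypothetical.

```python
import random


def on_average_estimate(x, y):
    """Ratio of running second moments: a_hat_k = mean(y*x) / mean(x*x).

    Simplified static-gain stand-in for an on-average estimator
    (an assumption, not the paper's recursive description).
    """
    p = 0.0  # running mean of y_k * x_k
    q = 0.0  # running mean of x_k ** 2
    a_hat, y_hat = [], []
    for k, (x_k, y_k) in enumerate(zip(x, y), start=1):
        p += (y_k * x_k - p) / k
        q += (x_k * x_k - q) / k
        a_k = p / q if q > 0.0 else 0.0
        a_hat.append(a_k)
        y_hat.append(a_k * x_k)  # identification with the averaged gain
    return a_hat, y_hat


# Hypothetical data: constant gain 0.7 observed through small Gaussian noise.
random.seed(1)
x = [random.uniform(-1.0, 1.0) for _ in range(5000)]
y = [0.7 * x_k + random.gauss(0.0, 0.05) for x_k in x]
a_hat, y_hat = on_average_estimate(x, y)
```

Because every past sample keeps the same weight, such an estimate settles on a single averaged gain, which is adequate for a stationary reference but cannot follow abrupt changes.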
However, the Wiener estimation applied to identification does not allow tracking a non-smooth reference. In agreement with [11], the filter form needs the Exponential Forgetting Factor (EFF) with Sliding Modes (SM), correcting its performance in each iteration. This consideration is in (4).
The relationship between the identification error ($\hat{e}_k$) and the Exponential Forgetting Factor function ($EFF_k$) is valid under the following condition: $EFF_k \in (-\infty, \infty)$ for all $\hat{e}_k > 0$, or $EFF_k \in (-0.37, 2.71)$ for all $\hat{e}_k \le 0$. It is also valid when $EFF_k$ includes the SM for the estimation.
Moreover, when evaluating point by point for each sample $k$, the identification requires an additional correction function. Thus, (5) covers the whole estimation-identification process by modifying the original $EFF_k$ in (4), yielding not only corrected parameters but also the final identification output. The first argument in (5), $\beta_k$, represents the value to correct, either the parameters or the states, by using the inner product, the sign, and the exponential functions as a part of (5). The second argument is the identification error (6), built as the difference between the reference and the identification.
Among the advantages of (5) as an adaptive approach is that the exponential function follows NLTV signals faster than a polynomial type. Also, the sign function assures the directional sense of the correction, allowing the identification to keep a better convergence rate to the reference signal at each step.
In the filtering theory, a better convergence rate requires different adaptive strategies that affect the estimation, the identification, and the relation between them. In this form, function (5) presents a new approach that considers the changes generated by the nonstationary signals.
The implementation of (5) allows three ways of use, depending on the adaptive requirements, either the parameters or states, seeking in each case a better convergence rate to the reference, as described below.
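The role of the sign and exponential functions in such a correction can be pictured with a toy example. The function below is hypothetical and is not the paper's function (5): it only assumes a correction of the form $\beta + \mathrm{sign}(e)\,(1 - e^{-|e|})$, which moves the value toward the reference by a bounded, exponentially shaped step and never overshoots, since $1 - e^{-|e|} \le |e|$.

```python
import math


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def corrected(beta, err):
    """Hypothetical correction (illustration only, not the paper's (5)).

    The sign gives the direction of the step and the exponential
    bounds its size to less than |err|.
    """
    return beta + sign(err) * (1.0 - math.exp(-abs(err)))


reference = 0.9
y_hat = 0.4
e = reference - y_hat      # identification error before correction
y_b = corrected(y_hat, e)  # corrected identification
e_b = reference - y_b      # identification error after correction
```

Under this assumed form the corrected error is smaller in magnitude than the original one for any nonzero error, and the same function could take either a parameter or a state as its first argument.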

A. MODEL 'a': SETTING THE ESTIMATED PARAMETERS ON AVERAGE
This configuration consists of applying the on-average parameter estimation from (1) in the linear identification function $\hat{y}_k$ given in (3), with its corresponding identification error $\hat{e}_k$.
With these values, the adaptation adjusts the estimation results through (5), where $\beta_k = \hat{a}_k$ and $\hat{e}_k = y_k - \hat{y}_k$, giving $F_k(\hat{a}_k, \hat{e}_k)$ as in (7). In symbolic form, $\hat{a}_{ak} := F_k(\beta_k, \hat{e}_k)$ is implemented in (3), resulting in the second identification (8) and the new error (9). The block diagram in Fig. 1 represents model 'a'.

B. MODEL 'b': SETTING THE IDENTIFICATION
Model 'b' concerns the use of $\hat{a}_k$ to determine $\hat{y}_k$ and the error $\hat{e}_k$, and then applies them in the function (5) with arguments $\beta_k = \hat{y}_k$ and $\hat{e}_k = y_k - \hat{y}_k$, defining $F_k$ as a corrected identification, i.e., $F_k = \hat{y}_{bk}$ as in (20), with its corresponding error (21), accomplishing $\hat{e}_{bk} < \hat{e}_k$.
Now, substituting (20) in (21) gives (22). Grouping terms gives (23). Besides, using the first two terms of the Taylor series in (23) yields (24). Minimizing the expression (24) and developing operations results in (25). Instead of (25), including the basic properties of the inner product gives (26). Nevertheless, evaluating the Taylor series in its first three terms gives (27). Grouping and reducing (27) gives (28). Now, taking the absolute value of $\hat{e}_{bk}$ in (28) leads to (29), and the triangle inequality applied to (29) results in (30). Then, with $\hat{e}_{bk}$ converging to $\hat{e}_k - \hat{e}_k^2/2$, the adaptation achieves $\hat{e}_{bk} \le \hat{e}_k$ in $\sum_{m=2}^{n} \hat{e}_k^m/m!$ with $m, n \in \mathbb{Z}^+$, and equality holds when $\hat{e}_k$ is near zero. The principal difference between models 'a' and 'b' is that the second configuration does not correct the parameters; it only adjusts the identification, as shown in the block diagram in Fig. 2.
The third model configuration considers three sections for the filtering process. In the first, it uses the on-average estimation $\hat{a}_k$, affecting the identification $\hat{y}_k$ and obtaining the convergence error $\hat{e}_k$.
To summarize the variables mentioned in this section, Table 1 presents the most representative ones used in the model descriptions and in Figs. 1 to 3.

A. IDENTIFICATION OF A NONLINEAR TIME-VARYING SYNTHETICAL SIGNAL
To analyze the performance of each model described in the previous sections, we consider the stochastic synthetic NLTV signal represented through the interval function (42), which has variations in frequency and amplitude, with $\{\xi_k\} \subseteq N(\mu, \sigma^2 < \infty)$ and $k \in \mathbb{Z}^+$, as seen in Fig. 4.
The synthetic signal with non-smooth variations seen in Fig. 4 is used to observe how the identification adapts to the reference when models 'a', 'b', and 'c' are implemented. The observed results are in Fig. 5, where the label 'Average' represents the results from the on-average approximation, without correction, and the remaining labels represent the three considered models.
In Fig. 5, the identification made without correction, with the parameters obtained on average and applied directly to the identification function, tracks the reference signal through the center of the trajectory and ignores the changes in frequency or amplitude.
In contrast, the response obtained with model 'a' approximates the reference signal on the internal side, making the identification remain below the reference.
In the results of models 'b' and 'c', the approximation is closer on the upper side, acquiring values that exceed the limits defined by the reference, but presenting different levels of convergence.
To obtain the functionals of error $J_{i,k}$, $i = \overline{1,5}$, the second probability moment is considered in a recursive form, based on the identification errors $e_{i,k}$, $i = \overline{1,5}$.
Each error is a function of the difference between the considered reference and its corresponding identification result. After identifying the signal from Fig. 4, on average, and with the three models, Fig. 6 shows the functionals of error.
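A recursive second moment of the error can be sketched as a running mean of the squared error, $J_k = \frac{1}{k}\left[(k-1)J_{k-1} + e_k^2\right]$. This standard form is assumed here for illustration, since the exact expression of the functional is not reproduced in this text; the name `error_functional` is hypothetical.

```python
def error_functional(errors):
    """Running mean of squared errors, updated one sample at a time.

    Implements J_k = ((k - 1) * J_{k-1} + e_k**2) / k, an assumed
    standard form of a recursive second moment.
    """
    j = 0.0
    out = []
    for k, e_k in enumerate(errors, start=1):
        j += (e_k * e_k - j) / k  # equivalent incremental form
        out.append(j)
    return out


J = error_functional([1.0, 0.0, 1.0, 0.0])
```

A curve of this kind that decays toward zero indicates that the identification keeps approaching the reference, which is how the functionals of the different models can be compared.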

B. EEG SIGNAL IDENTIFICATION FOR DIFFERENT PHYSIOLOGICAL AND PATHOLOGICAL CONDITIONS
The EEG signals have properties that make them a suitable application case for the estimation and identification of NLTV signals. This section presents the identification of these signals using the proposed model.
The processes for collecting and describing brain signals for different physiological and pathological states are given in [29]. Furthermore, since our objective is only to characterize the EEG information, describing it equivalently through its parameters and internal states, public databases are used.
Derived from the work in [29], the signals used in this paper are available in the repository [28], grouped into five sets (A, B, C, D, and E) with 100 signals in each. An individual EEG represents 23.6 seconds of recording, with a sampling frequency of 173.61 Hz within a spectral bandwidth of the acquisition system of 0.5 to 85 Hz.
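As a quick consistency check on these acquisition figures, the short sketch below computes the approximate number of samples per recording and the Nyquist frequency implied by the stated sampling rate. The variable names are hypothetical; the numeric inputs are the ones quoted above.

```python
duration_s = 23.6  # recording length of one EEG signal, in seconds
fs_hz = 173.61     # sampling frequency, in Hz

# Approximate number of samples per signal (k runs from 1 to n_samples).
n_samples = round(duration_s * fs_hz)

# Highest frequency representable at this sampling rate.
nyquist_hz = fs_hz / 2.0
```

The product gives roughly 4097 samples per signal, and the Nyquist frequency of about 86.8 Hz sits just above the 85 Hz upper edge of the acquisition bandwidth.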
Sets A and B are signals taken for normal brain states. In contrast, sets C, D, and E belong to preoperative diagnostic state signals. Sets C and D represent convulsion-free interval measurements, while set E shows brain activity with convulsions.
For the identification test, the proposed references are a set of nonstationary signals with different behavior. For example, signal E has an amplitude of 1000 µV, while the rest have 100 µV. The chosen signals represent each set (A to E), as viewed in Fig. 7.
The three proposed estimation-identification configurations and the on-average approximation are applied to the reference signals to evaluate which method provides the best performance, giving the results in Fig. 7. Regarding the model performances, Fig. 8 shows the functionals of error obtained from the identifications of the complete reference signals. In this case, these figures do not present the on-average approximation performance because its identification results in Fig. 7 are not close to the objective reference.
This outcome defines the performance range of each analyzed identification model, showing that the model 'c' configuration has the lowest error of all.
The aim of the estimation-identification process is to describe the evolution of the signal behavior in time with the smallest possible error, so model 'c' is an adequate tool to provide relevant information about the reference parameters.
In this form, once model 'c' is selected, it is used to evaluate five EEG signals from each set as references, 25 in total. This evaluation yields their characteristic parameters, shown in Fig. 9 as a polar representation that describes the bands of each signal type. Recalling that each set corresponds to a physiological condition, it helps to distinguish the three signal groups: regular (A and B), without convulsions (C and D), and with seizures (E).

C. COMPARISON WITH OTHER ESTIMATION-IDENTIFICATION TECHNIQUES
From the comparison among models 'a', 'b', and 'c', it is found that model c has the best convergence performance. Now, we use reference signals from the same data set [28] to compare it with other known techniques used for the identification from estimated parameters.
In a Wiener sense, the LMS, RLS, and Kalman filters have implementations to model output variable signals from a BB perspective. They need the input and the reference to approximate, as well as a parameter initialization.
The signal tracking in amplitude through time at each sample point, without delays, is an example of how to see the identification, which is the main objective of our proposed method.
To obtain the other methods' performances, the LMS, RLS, and Kalman filters are implemented by using the MATLAB® filter modules, initializing their parameters in such a way that they give the best results in convergence.
[Figure caption fragment: ... and Kalman filters performance for identifying the set-C reference signal from the database in [25], with two amplified segments at the right.]
The LMS algorithm implementation considers a filter length of 23. This value was defined after different tests with filter lengths from 1 to 50. Regarding the RLS, the filter length and forgetting factor used were 32 and 0.98, respectively. For the Kalman module, the considered values were those predefined for the LTI State-Space Variable model source in a discrete-time domain.
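For readers unfamiliar with how the forgetting factor enters the RLS recursion, the following is a minimal textbook sketch in plain Python. It is not the MATLAB module used in the comparison; the function name `rls_identify`, the initialization constant `delta`, and the 3-tap test system are assumptions, and only the default `lam=0.98` is taken from the text.

```python
import random


def rls_identify(x, d, n_taps, lam=0.98, delta=100.0):
    """Exponentially weighted RLS for FIR identification (textbook form).

    lam is the forgetting factor; delta scales the initial inverse
    correlation matrix P = delta * I (delta is an assumed value).
    """
    w = [0.0] * n_taps
    P = [[delta if i == j else 0.0 for j in range(n_taps)]
         for i in range(n_taps)]
    buf = [0.0] * n_taps  # most recent input sample first
    errors = []
    for x_k, d_k in zip(x, d):
        buf = [x_k] + buf[:-1]
        Pu = [sum(P[i][j] * buf[j] for j in range(n_taps))
              for i in range(n_taps)]
        denom = lam + sum(buf[i] * Pu[i] for i in range(n_taps))
        g = [v / denom for v in Pu]  # gain vector
        e_k = d_k - sum(wi * xi for wi, xi in zip(w, buf))  # a priori error
        w = [wi + gi * e_k for wi, gi in zip(w, g)]
        # P <- (P - g (P u)^T) / lam, using the symmetry of P.
        P = [[(P[i][j] - g[i] * Pu[j]) / lam for j in range(n_taps)]
             for i in range(n_taps)]
        errors.append(e_k)
    return w, errors


# Hypothetical noise-free test system (illustration only).
random.seed(2)
true_w = [0.5, -0.3, 0.1]
x = [random.uniform(-1.0, 1.0) for _ in range(300)]
d = [sum(true_w[j] * x[k - j] for j in range(3) if k - j >= 0)
     for k in range(len(x))]
w, errors = rls_identify(x, d, n_taps=3)
```

With a forgetting factor of 0.98, the effective memory is about $1/(1 - 0.98) = 50$ samples, and the $N \times N$ update of `P` is what makes each step quadratic in the filter length.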
An analysis without knowledge of the reference signal evolution may affect the identification if the initialization of the parameters is not adequate.
Studying the same reference signals from Fig. 7, the first comparison concerns the identification results obtained with each of the four filters mentioned above. Fig. 10 presents results only from the set C signal reference.
A second comparison is among the functionals of error measured from all five identifications with each method, with the results shown in Fig. 11. Figs. 10 and 11 show that model 'c' converges faster and with the smallest error.
Last but not least, the third comparison involves the execution times measured using the tic and toc MATLAB® functions while executing the Simulink filter modules. Because each execution takes a different time depending on the reference signal, Table 2 presents the average times obtained after identifying the first 100 signals from each set, from A to E.
It is necessary to mention that the values in Table 2 do not consider the time spent adjusting the filter gains or initializing the parameters for the different signals. Table 2 presents the mean times each technique takes to perform its identification task applied to 100 signals of each set, from A to E. The results are in seconds. They reflect a time reduction of 43% from the Kalman filter to the RLS, a 51% decrease between Kalman and LMS, and a 56% reduction between Kalman and model 'c'.

IV. CONCLUSION
This paper presented a new adaptive correction function for the estimation and identification process based on the Exponential Forgetting Factor (EFF), using the convergence error and the sign function.
The resulting general adaptive correction function described Time-Varying (TV) signals, such as the Electroencephalogram (EEG), point by point.
Of the three analyzed configurations, model 'c' achieved the best convergence level to the reference signal, with its functionals of error being the closest to zero.
Compared with the LMS, RLS, and Kalman filters for the approximation of EEG signals, model 'c' gave the best results regarding time and convergence error. Furthermore, the complexity of the proposed method is $O(N)$, like that of the LMS.
On the one hand, it was possible to observe that the LMS filter needs more time to follow a reference. Once it is converging, its approximation presents a constant delay along the complete signal.
On the other hand, the RLS tends toward the average of the signal values, while the Kalman filter ignores the small variations, considering them part of the noise. These two had the worst performance under the simulation conditions.
Finally, applying the proposed algorithm to EEG signals described their internal characteristics, viewed as a set of nonstationary parameters.
As future work, and because of the importance of system identification and parameter estimation for describing the behavior of stochastic references, we consider extending the proposed method to other approaches and perspectives, such as the Hammerstein description mentioned in [30], whose use has been increasing in this specific area, to look for an improvement in its description performance.
JOSÉ DE JESÚS MEDEL-JUÁREZ received the B.S. degree in aeronautical engineering from the Instituto Politécnico Nacional (IPN), in 1994, and the M.S. degree in electrical engineering and the Dr.Sc. degree from the Center for Investigation and Advanced Studies (CINVESTAV), in 1996 and 1998, respectively. He is a member of the Mexican Academy of Sciences, the Mexican Academy of Engineering and, the Academy of Geography and History. He belongs to the Legion of Honor and the National System of Researchers, as well as recognized as a Great Ibero-American Educator. He is currently a Professor with CIC-IPN. He is the author of six books, and eight dozen and a half articles in journals with strict arbitration (ISI, IEE, CONACyT, and other prestigious publishing houses). He directed 16 doctoral theses, receiving from one of them the highest recognition to a graduate student in Mexico. His research areas are digital filtering, control theory, and real-time systems.
ROSAURA PALMA-OROZCO received the Ph.D. degree in advanced technology from CICATA-IPN. She is currently a Professor with the Escuela Superior de Cómputo, IPN. She is a member of the National System of Researchers and became a member of the Biophysical Society, in 2008. She has several publications in the area of stochastic linear filtering and recursive filtering, specifically in dynamic estimation. Her interests are focused on the computational and mathematical applications of filtering theory for analysis and study of complex systems represented in state space.
ROMEO URBIETA-PARRAZALES was born in Tapanatepec, Oaxaca, Mexico, in 1947. He received the M.S. degree from IPN, in 1990. From 1971 to 1987, he worked on diverse disciplines such as electronics, automatic control design, and industrial processes fields, for Mexican Government departments. From 1988 to 1996, he was an M.S. Professor with CIDETEC-IPN, developing research on intelligent control and computing applied to industrial processes. He is the author of three books, and a dozen and a half articles in journals with strict arbitration (ISI, IEE, CONACyT, and other prestigious publishing houses). Since 1997, he has been a Professor with CIC-IPN, working on industrial intelligent control systems and stochastic filters using FPGA.
VOLUME 8, 2020