The Sliding Innovation Filter

In this paper, a new filter referred to as the sliding innovation filter (SIF) is presented. The SIF is an estimation strategy formulated as a predictor-corrector that makes use of a switching gain and innovation term. In estimation theory, a trade-off exists between robustness to disturbances and optimality in terms of estimation error. Unlike the Kalman filter (KF), the SIF is a sub-optimal filter in the sense that it does not provide the optimal solution to the linear estimation problem. However, the switching gain provides an inherent amount of robustness to estimation problems that may be ill-conditioned or contain modeling uncertainties and disturbances. The paper includes the proof of stability and explanation of the SIF gain. Furthermore, the SIF is extended to nonlinear estimation problems using a Jacobian matrix, resulting in the extended sliding innovation filter (ESIF). The methods are applied to a linear and nonlinear aerospace actuator system under the presence of a leakage fault. The results of the simulation demonstrate the improved performance of the SIF and ESIF strategies over popular KF-based methods.


I. INTRODUCTION
Estimation strategies extract useful information from sensors with noisy measurements. This process is critical for developing and implementing accurate and reliable control systems. The most popular and well-studied estimation strategy is the Kalman filter (KF), which was introduced nearly 60 years ago by Rudolf Kalman [1]. The KF is formulated as a predictor-corrector estimator. State estimates are first predicted using knowledge of the system, controller input, and state values from the previous time step. These estimates are then updated using a gain which is calculated based on the state error covariance and measurement errors. The state error covariance is a function of the expected state error values squared. The KF gain provides the optimal solution to the linear estimation problem assuming that the system is well-known and the system and measurement noise are zero-mean, Gaussian distributed, and white [1]. The KF is derived such that the trace of the state error covariance is minimized, thereby providing an optimal solution. The trace is used because it represents a sum of the estimation error squared, or in other words, is the sum of the diagonal elements of the state error covariance [2].
The associate editor coordinating the review of this manuscript and approving it for publication was Zhen Ren.
The KF has been extended to nonlinear systems, where it no longer provides an optimal solution. The most common nonlinear estimators based on the KF are the extended KF (EKF), the unscented KF (UKF), and the cubature KF (CKF) [1], [2]. The EKF utilizes a first-order Taylor series expansion (or Jacobian) that approximates the nonlinearities about the states of interest [2]. The EKF is formulated similarly to the KF with the exception that the nonlinearities are linearized by forming Jacobian matrices. The UKF provides improved estimation results in terms of accuracy by utilizing an unscented transform and 'sigma points' to approximate the nonlinearities [3]. The UKF is slightly more complicated than the EKF in terms of derivation and computation; however, it does not require any linearization. The CKF was also introduced in an attempt to improve upon the estimation accuracy of the UKF [4]. It has been shown in the literature that the CKF, which uses a cubature rule instead of an unscented transform to approximate the nonlinearities, is a special case of the UKF [1].
Depending on the application, KF-based solutions can provide good estimates; however, most KF strategies lack robustness to modeling errors, uncertainties, and disturbances. Other issues include numerical instability and matrix inversions [2]. A number of strategies have been implemented in an attempt to improve the robustness and stability of the KF [5]-[7]. Square-root formulations of the KF utilize QR decomposition and Cholesky factorization to ensure the state error covariance matrix is symmetric, which improves numerical stability [8], [9]. Another strategy involves imposing boundaries on the state estimate [10]. For example, given the upper bounds on the level of modeling uncertainty, the KF gain may be bounded to help improve estimation stability [10]. Other methodologies include the addition of fictitious system noise and the use of a fading memory strategy [11], [12]. Modifying the system noise matrix is done when less confidence is placed in the system model used by the filter, or in other words, when there is a large amount of system uncertainty [11]. Doing this causes the filter to place more emphasis on the measurements and less on the system model, which may be incorrect and may result in inaccurate estimates and instabilities [11], [13], [14].
In an effort to overcome robustness issues with the KF, the $H_\infty$ filter was introduced [15], [16]. This method utilizes boundaries based on the worst-case uncertainties to regulate the filtering gain, which ensures the state estimates are bounded to within a region of the true state trajectory. However, a trade-off exists between optimality and robustness. Other methods have extended the $H_\infty$ filtering strategy further by combining it with the KF and its nonlinear variants [17], [18].
Another strategy for improving estimation performance in the presence of modeling uncertainties and disturbances is to utilize variable structure techniques [19]-[21]. Variable structure systems are designed to include discontinuity hyperplanes. These hyperplanes divide the state space into regions. Inside these regions, the system dynamics are continuous [20], [22]. Sliding mode observers were introduced based on sliding mode and variable structure theory [12], [23]. The observer gain is calculated based on the innovation, and the error surface moves towards zero (ideally) [24]. Sliding mode observers define a hyperplane (i.e., a sliding surface) and apply a discontinuous switching force on the estimate to keep the estimate bounded within an area of the hyperplane [12], [25].
Based on variable structure theory and sliding mode observers, the smooth variable structure filter (SVSF) was introduced [26]. Similar to the KF, the SVSF is formulated as a predictor-corrector estimator but utilizes a different gain structure [26], [27]. The SVSF gain is a function of the measurement errors and a switching term [26]. The switching structure of the gain brings an inherent amount of stability to the estimation process as it bounds the estimates to the trajectory of the true state values [2], [28]. However, the SVSF as presented in [26] did not contain a state error covariance derivation. The SVSF was expanded in [2], [27], [29] to contain a covariance function, which increased the number of useful applications for the filter. Other improvements were also presented that included the use of a chattering function for fault detection, higher-order solutions, and multi-target tracking formulations [12], [30]-[33].
In this paper, the sliding innovation filter (SIF) is proposed. The SIF is based on variable structure techniques similar to the SVSF and sliding mode observers; however, its gain structure is simpler and it provides more accurate results while maintaining robustness. Note also that the SIF may be combined with control strategies for improved tracking performance and robustness to uncertainties and disturbances.
The paper is organized as follows. For completeness, the standard KF equations are provided in Section II, followed by the proposed SIF equations in Section III. Section IV provides the proof of stability for the SIF gain and a discussion on the sliding boundary layer. The simulation results are provided and discussed in Section V, followed by concluding remarks.

II. THE KALMAN FILTER
As described earlier, the KF provides the optimal solution to the linear estimation problem, which is described by (2.1) and (2.2). The goal of any estimator is to obtain the true state value $x_{k+1}$ using noisy measurements $z_{k+1}$.
The system and measurement noises are represented by $w_k$ and $v_k$, respectively. $A$, $B$, and $C$ represent the system (or process) matrix, input gain matrix, and measurement matrix, respectively. In (2.1) and (2.2), it is assumed that these matrices are fixed and do not change with time. The input to the system is defined as $u_k$. For the KF and most estimation methods, it is assumed that the system and measurement noises are zero mean with a Gaussian distribution [2]. The system and measurement noise are generated using the covariance matrices $Q$ and $R$, respectively.
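For reference, the linear system and measurement models cited as (2.1) and (2.2) are commonly written in the form below. This is a sketch built from the symbol definitions in this section; indexing the measurement noise at the measurement time $k+1$ is an assumption rather than something stated in the text.

```latex
\begin{align}
x_{k+1} &= A x_k + B u_k + w_k \tag{2.1} \\
z_{k+1} &= C x_{k+1} + v_{k+1} \tag{2.2}
\end{align}
```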
The KF is well established in the literature; however, the main KF equations are summarized here for completeness [1]. The KF is formulated as a predictor-corrector estimator and is an iterative process. The prediction stage involves calculating the state estimates based on the previous state values and knowledge of the system, as per (2.3). The corresponding state error covariance matrix is calculated in (2.4) and is used in the update stage to calculate the KF gain in (2.5), and is also used to update the state error covariance as per (2.7).
The update stage is summarized by (2.5) through (2.7). The gain calculated in (2.5) is used to update the state estimates in (2.6) based on the measurement error (or innovation). The gain is also used along with the predicted state error covariance to update the state error covariance in (2.7).
Note that $k$ refers to the time step, $k|k$ refers to the updated values at the previous iteration, and $k+1|k$ refers to the predicted values at time $k+1$ based on information at time $k$. Equations (2.3) through (2.7) represent the KF estimation process for linear systems and measurements defined by (2.1) and (2.2), respectively. The process is iterative and repeats every time step $k$. Note that (2.7) is known as the Joseph covariance form, and is considered to be numerically stable.
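The prediction and update stages described above correspond to the standard KF recursion, which can be sketched as follows. The equation numbers follow the roles given in the text (prediction, predicted covariance, gain, state update, Joseph-form covariance update); writing $Q$ and $R$ without time subscripts is an assumption based on the constant matrices described in this section.

```latex
\begin{align}
\hat{x}_{k+1|k} &= A \hat{x}_{k|k} + B u_k \tag{2.3} \\
P_{k+1|k} &= A P_{k|k} A^{T} + Q \tag{2.4} \\
K_{k+1} &= P_{k+1|k} C^{T} \left( C P_{k+1|k} C^{T} + R \right)^{-1} \tag{2.5} \\
\hat{x}_{k+1|k+1} &= \hat{x}_{k+1|k} + K_{k+1} \left( z_{k+1} - C \hat{x}_{k+1|k} \right) \tag{2.6} \\
P_{k+1|k+1} &= \left( I - K_{k+1} C \right) P_{k+1|k} \left( I - K_{k+1} C \right)^{T} + K_{k+1} R K_{k+1}^{T} \tag{2.7}
\end{align}
```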

III. THE SLIDING INNOVATION FILTER
This section describes the main sliding innovation filter (SIF) equations used for linear systems and measurements. Similar to the KF, the SIF is formulated as a predictor-corrector estimation method. The state estimates and state error covariances are first predicted using values obtained at the previous time step (or initialization), and then the state estimates and state error covariance are updated based on the measurements and correction term at the current time step. The correction term in this case is referred to as the SIF gain.

A. LINEAR SYSTEMS AND MEASUREMENTS
Similar to the KF, the prediction stage includes calculating the predicted or a priori ('before the fact') state estimates $\hat{x}_{k+1|k}$, the predicted state error covariance $P_{k+1|k}$, and the predicted innovation $\tilde{z}_{k+1|k}$ as per (3.1) through (3.3), respectively. The update stage includes calculating the SIF gain $K_{k+1}$, the updated or a posteriori ('after the fact') state estimates $\hat{x}_{k+1|k+1}$, and the updated state error covariance $P_{k+1|k+1}$ as per (3.4) through (3.6), respectively. Note that $C^{+}$ refers to the pseudoinverse of the measurement matrix, $\overline{\mathrm{sat}}$ refers to the diagonal of the saturation term, $\mathrm{sat}$ refers to the saturation of a value (yields a result between −1 and +1), $|\tilde{z}_{k+1|k}|$ refers to the absolute value of the innovation, $\delta$ refers to the sliding boundary layer width, and $I$ refers to the identity matrix (of dimension $n$-by-$n$ where $n$ is the number of states). Equations (3.1) through (3.6) represent the SIF estimation process for linear systems and measurements defined by (2.1) and (2.2), respectively.

The main difference between the KF and SIF strategies is in the structure of the gain. For the KF, the gain is derived as a function of the state error covariance, which offers optimality [1], [2]. However, for the SIF, the gain is based on the measurement matrix, the innovation, and a sliding boundary layer term. Although the state error covariance is not used to calculate the SIF gain, it still provides useful information as it represents the amount of estimation error in the filtering process.

Figure 1 provides an overview of the SIF estimation concept. An initial estimate is pushed towards the sliding boundary layer, which is defined based on the amount of uncertainties in the estimation process. Once inside the sliding boundary layer, the estimates are forced to switch about the true state trajectory by the SIF gain.
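The six SIF equations referred to as (3.1) through (3.6) can be sketched as follows. This is a reconstruction from the symbol definitions in this section and from the gain term that appears in (4.6); details such as the absence of time subscripts on $Q$ and $R$ are assumptions.

```latex
\begin{align}
\hat{x}_{k+1|k} &= A \hat{x}_{k|k} + B u_k \tag{3.1} \\
P_{k+1|k} &= A P_{k|k} A^{T} + Q \tag{3.2} \\
\tilde{z}_{k+1|k} &= z_{k+1} - C \hat{x}_{k+1|k} \tag{3.3} \\
K_{k+1} &= C^{+} \, \overline{\mathrm{sat}}\!\left( \left| \tilde{z}_{k+1|k} \right| / \delta \right) \tag{3.4} \\
\hat{x}_{k+1|k+1} &= \hat{x}_{k+1|k} + K_{k+1} \tilde{z}_{k+1|k} \tag{3.5} \\
P_{k+1|k+1} &= \left( I - K_{k+1} C \right) P_{k+1|k} \left( I - K_{k+1} C \right)^{T} + K_{k+1} R K_{k+1}^{T} \tag{3.6}
\end{align}
```

In this form the only difference from the KF recursion is the gain: the covariance-based expression is replaced by the pseudoinverse of $C$ multiplied by a diagonal saturation matrix.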
To help illustrate the SIF gain, consider a system with two measurements (and $C = I$), such that the saturation term in (3.4) could be expanded further as in (3.7). When multiplied with the innovation $\tilde{z}_{k+1|k}$, as in (3.5), the state estimates $\hat{x}_{k+1|k}$ are updated with the term in (3.8). As shown in (3.8), the state estimates are updated with their corresponding innovation and sliding boundary layer term. The SIF gain effectively acts as a switching term, which keeps the measurement errors bounded about the true state trajectory. The sliding boundary layer $\delta$ is defined as a function of the modeling uncertainty and noise present in the estimation process. The width can be tuned to obtain the desired estimation result. A method to set the width is explained later in Section IV using the maximum uncertainties in the estimation process (4.23). Another starting point for tuning is to use the values of the measurement noise covariance, for example, $\delta = 10\,\mathrm{diag}(R)$. The values can then be tuned by trial-and-error, grid search methods, or optimization techniques to reduce the estimation error.
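The two-measurement case with $C = I$ can be sketched in a few lines of Python. This is a minimal illustration, not the authors' MATLAB code: the function names, the values of `r_diag`, `x_pred`, and `z`, and the choice to return the gain diagonal are all hypothetical and were picked so the arithmetic is easy to follow by hand.

```python
def sat(value, limit=1.0):
    """Clip a scalar to the interval [-limit, +limit]."""
    return max(-limit, min(limit, value))


def sif_update_identity_c(x_pred, z, delta):
    """SIF state update for C = I (so the pseudoinverse of C is also I).

    x_pred: predicted state estimates, z: measurements, delta: sliding
    boundary layer widths (one per measurement). Returns the updated
    estimates and the diagonal entries of the gain.
    """
    # Innovation for C = I: measurement minus predicted state.
    innovation = [zi - xi for zi, xi in zip(z, x_pred)]
    # Diagonal of the saturation term: sat(|innovation_i| / delta_i).
    gain_diag = [sat(abs(e) / d) for e, d in zip(innovation, delta)]
    # Each state is corrected by its own gain entry times its innovation.
    x_upd = [xi + g * e for xi, g, e in zip(x_pred, gain_diag, innovation)]
    return x_upd, gain_diag


# Hypothetical values (assumptions for illustration only).
r_diag = [0.25, 0.5]                  # diagonal of R
delta = [10.0 * r for r in r_diag]    # starting point: delta = 10 diag(R)
x_pred = [0.0, 0.0]
z = [1.25, 8.0]

x_upd, gain_diag = sif_update_identity_c(x_pred, z, delta)
print(gain_diag)  # [0.5, 1.0]: first innovation is inside the layer, second saturates
print(x_upd)      # [0.625, 8.0]
```

When an innovation exceeds its boundary layer width the gain entry saturates at 1 and the estimate is moved all the way to the measurement; inside the layer the correction is scaled down in proportion to the innovation magnitude.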
For the cases when there are fewer measurements than states ($m < n$), artificial measurements can be created based on existing measurements to create a full measurement matrix. The structure could also be modified as per a Luenberger observer or other strategies as per [12], [34]. This process would be required to estimate parameters of the system matrix using the SIF.

B. NONLINEAR SYSTEMS AND MEASUREMENTS
Similar to the extended Kalman filter (EKF), the proposed extended sliding innovation filter (ESIF) makes use of the linearized form of the nonlinear system and measurement functions. For example, consider the nonlinear system function $f(\hat{x}_{k|k}, u_k)$ and the nonlinear measurement function $h(\hat{x}_{k+1|k})$. Linearized forms of these nonlinearities may be calculated using the partial derivatives in (3.9) and (3.10), respectively. The structure of the nonlinear SIF estimation process is similar to the linear SIF case, with the main difference being the formulation of the gain. The prediction stage consists of three main equations, (3.11) through (3.13). Note that $f$ refers to the nonlinear system function, $F_k$ refers to the linearized version of the system (Jacobian matrix or first-order Taylor series expansion) at time $k$, and $h$ refers to the nonlinear measurement function. The states are first predicted in (3.11) before being updated in (3.15) using the innovation defined in (3.13) and gain defined in (3.14). The state error covariance matrix is first predicted in (3.12) before being updated in (3.16). Note that the gain (3.14) is also used to update the state error covariance (3.16).
The update stage consists of three main equations, (3.14) through (3.16). Note that $H^{+}_{k+1}$ refers to the pseudoinverse of the linearized measurement matrix (first-order Taylor series expansion) at time $k+1$, and $H_{k+1}$ refers to the linearized measurement matrix at time $k+1$. Equations (3.9) through (3.16) represent the ESIF estimation process for nonlinear systems and measurements.
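The ESIF equations referred to as (3.9) through (3.16) can be sketched by analogy with the linear case. This is a reconstruction from the roles the text assigns to each equation number; the evaluation points of the Jacobians (the latest available estimate in each case) follow the usual EKF convention and are an assumption here.

```latex
\begin{align}
F_k &= \left. \frac{\partial f}{\partial x} \right|_{\hat{x}_{k|k},\, u_k} \tag{3.9} \\
H_{k+1} &= \left. \frac{\partial h}{\partial x} \right|_{\hat{x}_{k+1|k}} \tag{3.10} \\
\hat{x}_{k+1|k} &= f\!\left( \hat{x}_{k|k}, u_k \right) \tag{3.11} \\
P_{k+1|k} &= F_k P_{k|k} F_k^{T} + Q \tag{3.12} \\
\tilde{z}_{k+1|k} &= z_{k+1} - h\!\left( \hat{x}_{k+1|k} \right) \tag{3.13} \\
K_{k+1} &= H_{k+1}^{+} \, \overline{\mathrm{sat}}\!\left( \left| \tilde{z}_{k+1|k} \right| / \delta \right) \tag{3.14} \\
\hat{x}_{k+1|k+1} &= \hat{x}_{k+1|k} + K_{k+1} \tilde{z}_{k+1|k} \tag{3.15} \\
P_{k+1|k+1} &= \left( I - K_{k+1} H_{k+1} \right) P_{k+1|k} \left( I - K_{k+1} H_{k+1} \right)^{T} + K_{k+1} R K_{k+1}^{T} \tag{3.16}
\end{align}
```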

IV. SIF PROOF OF STABILITY
The SIF gain as defined in (3.4) provides robustness to modeling uncertainties and disturbances. This section provides the proof of stability. Consider a Lyapunov function $M_{k+1}$ defined by the updated innovation, as in (4.1). According to Lyapunov stability theory, the estimation process is considered stable if the condition in (4.2) holds. Equation (4.2) states that the rate of change of the Lyapunov function (4.1) must be negative, or in other words, the absolute magnitude of the innovation must decrease with time. Based on (4.2) and the definition in (4.1), the condition for stability in (4.3) is obtained, where $T$ refers to the sample time (time step). Equation (4.3) can be rewritten as (4.4), and simplifying (4.4) yields (4.5). Therefore, the estimation process is considered stable if the innovation is decreasing over time.

To explore the stability further, we need to find the expectation of $\tilde{z}_{k+1|k+1}$ as per (4.5). First, consider the definition for the updated state error $\tilde{x}_{k+1|k+1}$, which is the difference between $x_{k+1}$ in (2.1) and $\hat{x}_{k+1|k+1}$ in (3.5):

$$\tilde{x}_{k+1|k+1} = \tilde{x}_{k+1|k} - C^{+}\,\overline{\mathrm{sat}}\!\left( \left| \tilde{z}_{k+1|k} \right| / \delta \right) \tilde{z}_{k+1|k} \tag{4.6}$$

Furthermore, note that the innovation may be defined as a function of the state error and measurement noise, as in (4.7). Multiplying (4.6) by $C$ and utilizing (4.7) transforms (4.6) into (4.8), and simplifying (4.8) yields (4.9). Here we define $D = \overline{\mathrm{sat}}\left( |\tilde{z}_{k+1|k}| / \delta \right)$. We need to solve for $\tilde{z}_{k+1|k}$ to expand and simplify (4.9) further. Based on the definitions for state and measurement error (innovation), we find (4.10) and (4.11) for the predicted state error and predicted innovation, respectively. Substituting (4.11) into (4.9) yields (4.12), which can be rewritten as (4.13), where $\eta$ is defined as an uncertainty vector in (4.14). The uncertainty vector in (4.14) is unknown; however, its expected value is zero provided the noise is zero mean (one of the main assumptions for standard estimation theory). In other words, $E[w_k] = E[v_k] = 0$.
Taking the expectation of (4.13) yields (4.15), where $E$ refers to the expectation or expected value. This means that the estimation process is stable as long as $(I - D)\,C A C^{+}$ is less than unity. This ensures that (4.5) satisfies the Lyapunov condition for stability defined by (4.1) and (4.2) provided that (4.16) holds. For (4.16), there are two cases to consider. The first case, when $C A C^{+}$ is positive, is given by (4.17). The second case, when $C A C^{+}$ is negative, is given by (4.18). Based on the fact that $D$ should be larger than zero and less than 1, $D$ is contained and defined by (4.19). Based on the conditions of (4.5), (4.16), and (4.19), the SIF is considered stable as the Lyapunov condition defined by (4.1) and (4.2) is satisfied.
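The chain of substitutions described in this section can be sketched as follows. This is a reconstruction under stated assumptions, not a reproduction of the numbered equations: it assumes the models $x_{k+1} = A x_k + B u_k + w_k$ and $z_{k+1} = C x_{k+1} + v_{k+1}$, assumes $C C^{+} = C^{+} C = I$ (for example, a square invertible $C$), and treats $D$ as fixed when the expectation is taken.

```latex
\begin{align}
\tilde{z}_{k+1|k} &= C \tilde{x}_{k+1|k} + v_{k+1},
\qquad
\tilde{z}_{k+1|k+1} = C \tilde{x}_{k+1|k+1} + v_{k+1} \\
\tilde{z}_{k+1|k+1} &= \left( I - C C^{+} D \right) \tilde{z}_{k+1|k}
  = \left( I - D \right) \tilde{z}_{k+1|k} \\
\tilde{x}_{k+1|k} &= A \tilde{x}_{k|k} + w_k \\
\tilde{z}_{k+1|k} &= C A \tilde{x}_{k|k} + C w_k + v_{k+1}
  = C A C^{+} \left( \tilde{z}_{k|k} - v_k \right) + C w_k + v_{k+1} \\
\tilde{z}_{k+1|k+1} &= \left( I - D \right) C A C^{+} \tilde{z}_{k|k} + \eta \\
\eta &= \left( I - D \right) \left( C w_k + v_{k+1} - C A C^{+} v_k \right) \\
E\!\left[ \tilde{z}_{k+1|k+1} \right] &= \left( I - D \right) C A C^{+} \, E\!\left[ \tilde{z}_{k|k} \right]
\end{align}
```

The second line follows from multiplying (4.6) by $C$ and adding $v_{k+1}$ to both sides. The last line uses $E[w_k] = E[v_k] = 0$, which makes $E[\eta] = 0$, and shows why the magnitude of $(I - D)\,C A C^{+}$ governs whether the expected innovation shrinks from one step to the next.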
Finally, note that the uncertainty vector defined by (4.14) may be used to help set the sliding boundary layer width. In order to reduce the effects of chattering (or high-frequency switching), the width should be set larger than or equal to the bound in (4.20), where $|\tilde{z}|_{\max}$ refers to the largest innovation and $\eta_{\max}$ represents the largest uncertainty vector. To find the largest uncertainty vector, we set $D = 0$, which occurs when the innovation is zero or the boundary layer is infinitely large. In this case, we modify (4.20) to consider maximum values, as in (4.21), which may be simplified further to yield (4.22). Based on (4.20) and (4.22), a condition for the sliding boundary layer can be defined, as in (4.23). This condition can be used as a starting point for tuning the sliding boundary layer width. Note that if the sliding boundary layer is set wider than the uncertainties, the estimates will be smoothed. If the width is set smaller, chattering will occur, which represents high-frequency switching due to the SIF gain.

A. LINEAR AEROSPACE SYSTEM
To demonstrate the robustness of the proposed estimation strategy, the KF and SIF are applied to a linear system with noise. The studied system is a type of aerospace flight surface actuator, referred to as the electrohydrostatic actuator (EHA). It has been well-studied and presented in the literature [35]-[37]. A simplified linear EHA model was formulated in state space where the states of interest refer to position, velocity, and acceleration [2], [26]. The model parameters were found through experimentation of an EHA [26], [38]. The linear forms of the system and measurements are described using state space equations [26], where the sample time $T$ is 1 ms, $k$ is the time step, $C$ refers to the measurement matrix, which in this case is an identity matrix of dimension $m \times m$ or $3 \times 3$, and $u$ is the controller input for the system (defined in Fig. 2) that drives the desired trajectory. The system and measurement noises ($w$ and $v$) are normally distributed with zero mean and covariances $Q$ and $R$. The three states in (5.1) represent the position, velocity, and acceleration of the linear actuator, respectively. The initial state values, measurements, and estimates were set to zero. The initial state error covariance values were set to $P_{0|0} = 10Q$. The sliding boundary layer width was manually tuned to yield the smallest estimation error, and was found for this simulation to be $\delta = [0.05,\ 1,\ 0.5]$. The simulation was coded in MATLAB.
The results of applying the KF and SIF strategies on the linear EHA are shown in Fig. 3. As expected, since the system is linear and well-known, the KF yields better results in terms of root mean square error (RMSE) under normal operating conditions. Note that RMSE is defined by (5.5) where n is the number of time steps. The results are summarized in Tab. 1.
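The RMSE metric used to compare the filters can be sketched as below. This assumes the usual definition, the square root of the mean squared difference between the true and estimated values over $n$ time steps; the function name `rmse` and the sample values are hypothetical.

```python
import math


def rmse(true_values, estimates):
    """Root mean square error between two equal-length sequences."""
    if len(true_values) != len(estimates) or not true_values:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(x - x_hat) ** 2 for x, x_hat in zip(true_values, estimates)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical single-state trajectory: one sample is off by 2, the rest match.
print(rmse([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 2.0]))  # 1.0
```

For a multi-state system such as the EHA, the function would be applied to each state trajectory separately, giving one RMSE value per state.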
In an effort to demonstrate the robustness of the SIF strategy, consider the case when the system has a fault injected half-way through the simulation (at $t = 1$ s). In this case, the linear system state equation used by the filters is changed. The results of the modeling uncertainty and its effects on the filters are shown in Fig. 4. The model mismatch at 1 second causes the KF to deviate from the true state trajectory, yielding poor estimates of the true position. The SIF was still able to perform relatively well, and was bounded to the true state trajectory due to the switching effects in the SIF gain.
The RMSE results for the faulty case are shown in Tab. 2. The SIF performs only slightly worse than the normal case. However, the KF is unable to recover from the modeling uncertainty and yields worse performance. This was expected as the KF is derived based on the assumption that the system is known. Figure 5 further illustrates the presence of the modeling uncertainty. The KF was unable to recover from the system change, whereas the SIF provided a good estimate.
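The mechanism behind this behaviour can be illustrated with a toy scalar example. This is a sketch under loudly stated assumptions and is unrelated to the EHA model or to the reported results: the plant pole values, `q`, `r`, `delta`, and the function name are hypothetical, the true plant changes at the fault while both filters keep the nominal model, and noise is omitted so the run is deterministic.

```python
import math


def sat(value, limit=1.0):
    """Clip a scalar to the interval [-limit, +limit]."""
    return max(-limit, min(limit, value))


def run_toy_fault_demo(steps=200, fault_step=100):
    """Compare a scalar KF and SIF when the plant changes at fault_step.

    Returns the post-fault RMSE of the KF and of the SIF. All numbers
    are hypothetical tuning values chosen for illustration.
    """
    a_nominal, a_fault = 0.9, 0.5   # plant pole before and after the fault
    q, r = 0.01, 1.0                # KF noise covariances
    delta = 0.5                     # SIF sliding boundary layer width
    x = 0.0                         # true state
    x_kf, p_kf = 0.0, 1.0           # KF estimate and error covariance
    x_sif = 0.0                     # SIF estimate
    sq_err_kf = sq_err_sif = 0.0

    for k in range(steps):
        a_true = a_nominal if k < fault_step else a_fault
        x = a_true * x + 1.0        # true plant with constant unit input
        z = x                       # measurement with C = 1 and no noise

        # Scalar KF: predict, gain, update (Joseph form covariance).
        x_pred = a_nominal * x_kf + 1.0
        p_pred = a_nominal * p_kf * a_nominal + q
        k_gain = p_pred / (p_pred + r)
        x_kf = x_pred + k_gain * (z - x_pred)
        p_kf = (1.0 - k_gain) ** 2 * p_pred + k_gain ** 2 * r

        # Scalar SIF: predict, then correct with the saturated gain.
        x_pred = a_nominal * x_sif + 1.0
        innovation = z - x_pred
        x_sif = x_pred + sat(abs(innovation) / delta) * innovation

        if k >= fault_step:         # accumulate errors after the fault only
            sq_err_kf += (x - x_kf) ** 2
            sq_err_sif += (x - x_sif) ** 2

    n = steps - fault_step
    return math.sqrt(sq_err_kf / n), math.sqrt(sq_err_sif / n)


rmse_kf, rmse_sif = run_toy_fault_demo()
print(f"post-fault RMSE: KF = {rmse_kf:.3f}, SIF = {rmse_sif:.3f}")
```

In this noise-free toy case the post-fault innovation always exceeds `delta`, so the SIF gain saturates and the estimate follows the measurement, while the KF gain stays small and the estimate keeps trusting the outdated model. With measurement noise present, a saturated SIF gain would pass that noise into the estimate, which is the robustness-versus-optimality trade-off described in the text.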

B. NONLINEAR AEROSPACE SYSTEM
A nonlinear form of the EHA system is studied to further demonstrate the robustness of the SIF [2]. In this case, the extended sliding innovation filter (ESIF) is applied and compared with the EKF. The ESIF is summarized by (3.9) through (3.16). The nonlinear EHA can be accurately described by the following state space equations (related to its position, velocity, and acceleration) [2]:

$$x_{2,k+1} = x_{2,k} + T x_{3,k} + w_{2,k} \tag{5.8}$$

The differential pressure $x_{4,k}$ may also be modelled using the velocity of the actuator and a number of friction constants. The sample time of the system is $T = 1$ ms, and the input to the EHA system is defined by (5.11). As described in [35], $A_E$ refers to the piston cross-sectional area, $\beta_e$ is the effective bulk modulus (i.e., the 'stiffness' in the hydraulic circuit), $D_p$ refers to the pump displacement, $L$ represents the leakage coefficient, $M$ is the load mass (i.e., weight of the cylinders), $V_0$ is the initial cylinder volume, and $\omega_p$ refers to the pump angular velocity. Two main sets of parameters that are affected by induced fault conditions in the EHA are the leakage coefficients $L$ and $Q_{L0}$, and the friction coefficients $a_1$, $a_2$, and $a_3$. Tab. 3 lists the known parameter values that were experimentally determined in [38].
There are four important states and parameters: actuator position, velocity, acceleration, and differential pressure. It is assumed that measurements are available for each state, such that the measurement matrix $C$ is full or $C = I$ (5.12). To properly implement the EKF and ESIF, the first-order Taylor series expansion (Jacobian) of (5.7) through (5.10) must be solved, as per (3.9). The linearized system matrix $F_k$ is defined in terms of the entries $F_{32}$, $F_{33}$, and $F_{42}$. The system noise covariance matrix $Q$ was defined based on tuning the experimental setup in [35]. The measurement noise covariance $R$ was set to $10^{3}Q$. The system noise $w$ and measurement noise $v$ were defined as normal distributions with covariances $Q$ and $R$, respectively. Both strategies use the same initial state estimate $\hat{x}_0$ and error covariance matrix $P_{0|0}$. The sliding boundary layer was manually tuned to yield the smallest estimation error, and was set to $\delta = 10^{-3} \times [1,\ 5,\ 10,\ 10^{9}]$. The angular pump velocity $\omega_p$ was set to a square wave with ±100 rad/s, and was used in (5.11) to generate the input used by the EHA system (as shown in Fig. 6). Note that the input signal refers to volumetric flow rate within the hydraulic circuit of the EHA. Under normal operating conditions, both the EKF and ESIF were able to successfully track the three kinematic states and the differential pressure.
FIGURE 6. Input signal used for the nonlinear EHA scenario. Note that the input signal (5.10) was calculated based on a square wave representing the pump angular velocity.
The EHA velocity and corresponding EKF and ESIF estimates are shown in Fig. 7. The differential pressure is shown in Fig. 8. The RMSE results are summarized in Tab. 4 and show comparable performance between the two strategies. The ESIF provided a slightly better position estimate in terms of error, whereas the EKF yielded slightly better estimates for the velocity and acceleration. The ESIF provided a significantly better estimate for the differential pressure (about three times lower error).
In an effort to demonstrate the robustness of the ESIF strategy, consider the case when the system has a fault injected half-way through the simulation (at $t = 4.5$ s). In this case, the leakage parameters $L$ and $Q_{L0}$ are changed as per Tab. 3 (to the faulty values). This simulates a major leak occurring in the hydraulic circuit.
The results of the modeling uncertainty and its effects on the filters are shown in Fig. 9. At 4.5 seconds, the fault is injected into the system. The model mismatch causes the EKF to deviate from the true state trajectory, yielding poor estimates of the true differential pressure. The ESIF was still able to perform relatively well, and was bounded to the true state trajectory. Similar to the linear case, this is due to the inherent stability caused by the switching effects in the ESIF gain. The RMSE results for the faulty case are summarized in Tab. 5.
FIGURE 8. EHA differential pressure trajectory and corresponding EKF and ESIF estimates. Initially, the EKF gave a poor estimate but quickly corrected (due to the initial conditions).
FIGURE 9. EHA differential pressure error based on corresponding EKF and ESIF estimates. The leakage fault is injected mid-way through the simulation. The ESIF is able to accurately compensate for the modeling uncertainty, whereas the EKF has difficulty overcoming the model mismatch.
The RMSE results further highlight the robustness of the proposed ESIF strategy. The sliding boundary layer $\delta$ and the switching effect of the ESIF gain keep the estimated states bounded to the true state trajectory. The sliding boundary layer $\delta$ may be tuned manually to improve estimation results, with initial values designed based on knowledge of the noise or uncertainties present in the estimation problem. In this scenario, it was initially tuned based on the amount of measurement noise present (for example, $\delta = 10 \times \mathrm{diag}(R)$), and then optimized based on the estimation error.

VI. CONCLUSION
In this paper, a new estimation strategy called the sliding innovation filter (SIF) was proposed. The filter is called the SIF due to the fact that the innovation slides along a boundary layer close to the true states. The SIF gain structure is based on variable structure and sliding mode theory. It utilizes a switching term that allows the estimates to remain bounded to within a region of the true state trajectory. This improves estimation robustness to modeling uncertainties, errors, and disturbances. A nonlinear formulation of the SIF, referred to as the extended sliding innovation filter (ESIF), was also presented. The SIF and ESIF were applied to an aerospace system and compared with the well-known Kalman filter and its nonlinear form, the extended KF. The simulation results demonstrated that the proposed SIF and ESIF strategies provide sub-optimal yet robust estimates for both linear and nonlinear systems under the presence of uncertainties and disturbances.