Grid Search Based Tire-Road Friction Estimation

The tire-road friction coefficient (<inline-formula> <tex-math notation="LaTeX">$\mu _{\max }$ </tex-math></inline-formula>) is an important input for vehicle dynamics control system and automated driving modules. However, reliable and accurate measurement of this parameter is difficult and costly in mass-produced vehicles and thus estimation is necessary. In this research, an innovative optimization based framework to estimate <inline-formula> <tex-math notation="LaTeX">$\mu _{\max }$ </tex-math></inline-formula> is proposed. The observation problem is formulated as a non-convex optimization. A novelty of the framework is that the <inline-formula> <tex-math notation="LaTeX">$\mu _{\max }$ </tex-math></inline-formula> can be accurately estimated in real time together with side slip angle as a by-product without requiring a good initial guess for the non-convex optimization. A key observation is that the time derivative of <inline-formula> <tex-math notation="LaTeX">$\mu _{\max }$ </tex-math></inline-formula> and side slip angle can be assumed as zero and computed based on measurement, respectively. This allows the observed variables to be updated at a relatively low frequency w.r.t. the solution of the optimization problem. During the interval between each two neighbouring updating time, the observer estimates the <inline-formula> <tex-math notation="LaTeX">$\mu _{\max }$ </tex-math></inline-formula> and side slip angle by integrating sensor information based on the last update. To find the global optima approximately, a grid search method is implemented for solving non-convex optimization. The estimation results from the proposed observer and a linearization based observer (lbo) are finally compared under various tire-road conditions with simulations and experiments. The results showed that 1) the proposed observer can always guarantee stability in a wide range of vehicle operations while lbo cannot. 2) w.r.t. root mean square of estimation error, the proposed observer performs overall better than lbo in <inline-formula> <tex-math notation="LaTeX">$\mu _{\max }$ </tex-math></inline-formula> estimation.


I. INTRODUCTION
Vehicle dynamics control systems and automated driving functions require information of the vehicle states and parameters for both feed-back control and decision making. The tire-road friction coefficient µ max is a critical variable indispensable for a wide range of vehicle dynamics control systems and automated driving modules, including wheel-slip and traction control [1], [2], lateral stability control [3], control allocation [4] as well as path tracking [5]. It can be expected that accurate information of the µ max can significantly enhance the performance and reliability of many automotive control systems. However, in real applications, µ max cannot be measured directly and has to be estimated in real time to avoid expensive sensors.
There is a considerable amount of research dedicated to the estimation of µ max . These studies can be mainly divided into following two categories: cause-based and effect-based methods [6].
The associate editor coordinating the review of this manuscript and approving it for publication was Huaqing Li .
Cause-based methods attempt to investigate measured factors leading to changes of friction on the road based on sensors (vision, laser and temperature sensors, etc.) and then estimate µ max . Andersson used optical sensors measuring infrared light at different wavelengths reflected from the road, such that different road surfaces (dry, wet, icy, snow surfaces) can be identified [7]. Holzmann applied cameras to classify road types by analyzing the texture of road images [8]. Similar studies are also shown in [9]- [13]. However, these methods require extra sensors installed on mass-produced vehicles which augment the costs. Besides, other factors (tire wear as well as hydroplanning, etc.) influencing µ max are difficult to be considered, so only an interval of the tire-road friction coefficient can be obtained which can be quite large sometimes. For example, on wet road condition, the interval may vary from 0.3 to 0.9 [14] which is not sufficient in many applications such as Autonomous Emergency Braking system (AEB), Electronic Stability Control system (ESC) [15].
In this article, an effect-based method focusing on vehicle lateral dynamics is presented to estimate the tire-road friction coefficient. A considerable amount of research dedicated to this topic can be found. Boßdorf utilized Extended Kalman Filter (EKF) to estimate the tire-road friction coefficient with a two track vehicle model [33]. Ray also applied EKF to observe vehicle states and tire forces, then the obtained force and slip angle are compared statistically with those that result from a nominal analytic tire model to select the most likely tire-road friction coefficient [34]. However, as a vehicle undergoes agile or extreme maneuvers, the tires are pushed to highly nonlinear operation regions resulting in nonlinear vehicle dynamics. Consequently linearization based approaches, like EKF, do not perform well in this case [35]. In [32], Grip proposed a Lyapunov-based observer to identify the vehicle side slip angle with friction adaptation, but the observer is based on a crucial assumption that the tire force is affine in a tire-related parameter, which is not consistent with the traditional tire model like brush model and magic formula [36]. In [31], Ding utilized recursive least squares (RLS) to estimate vehicle side slip angel and tire-road friction coefficient by minimizing the squared errors between estimated lateral acceleration and yaw acceleration of the vehicle and their measured values. However, for implementing RLS, measured lateral acceleration and yaw acceleration have to be linearized around the estimated tire-road friction coefficient and the side slip angle, which may lead to large estimation error when the tires come into highly nonlinear region.
Some researchers attempt to estimate tire-road friction coefficient by introducing total aligning torque information observed from widely installed Electric Power System (EPS) or Active Front Steering system (AFS), since self-aligning torque (contained in total aligning torque) comes into nonlinear region w.r.t. tire slip angle earlier than lateral tire force saturates [36]. As a result, only moderate excitation is needed to estimate road conditions [37]. Selected research results from literature are as follows. In [37] the authors design a nonlinear observer and argue that the estimation error converges when the front wheel slip angle and the road friction coefficient are both over-or underestimated at the same time, which suggests that their system may not be stable. In [28], a nonlinear robust observer is designed which is locally stable, and the region of the attraction cannot cover the whole operational area of the tires. In [30] a nonlinear least square method is proposed, but the global minimum may not be reached without an accurate initial guess.
Most related to this study is the optimization-based approach for tire-road friction coefficient estimation. Some researchers utilized moving horizon estimation (MHE) [38], [39] or nonlinear recursive least squares (NRLS) [30], [31]. However, as the optimization process in these methods is non-convex, there is no guarantee to find the global minimum without an accurate initial guess. Consequently, the observation error can be excessively large in some situations. Global optimization algorithms like differential evolution [40] (DE), particle swarm optimization [41] (PSO) are usually able to obtain the optimum solution of non-convex optimization independent of initial guess, but the inferior real time capability of these methods is a great barrier for their application on the tire-road friction coefficient estimation. The difficulty in solving non-convex optimization problems in real-time without requiring a good initial guess seems to be a great challenge for implementation of the optimization-based observers in practice.
Hence, in this research, an innovative optimization based framework to estimate tire-road friction coefficient with side slip angle as a by-product is proposed. It has to be mentioned that this current article is an improved version of previous work [42], we extend it by firstly including more introduction and explanation of the optimization based observer. Besides, experiments are conducted to validate the proposed observer. The overall content of this article is as follow: the observation problem is formulated as a non-convex optimization. A novelty of the framework is that the tire-road friction coefficient can be accurately estimated in real-time 1 with side slip angle as a by-product without requiring a good initial guess for the non-convex optimization. A key observation is that the time derivative of side slip angle can be computed based on measurement and the time derivative of µ max is assumed to be zero. This allows the observed variables to be updated at a relatively low frequency w.r.t. the solution of the optimization problem. During the interval between each two neighbouring updating time, the observer estimates the tire-road friction coefficient and side slip angle by integrating sensor information based on the last update. To find the global optima approximately, a grid search method is implemented for solving non-convex optimization. One advantage of the proposed observer is that, under no model uncertainty and measurement noise influence, the estimation error does not grow even when the system lacks observability. When observability requirement is satisfied, global asymptotic stability of the observer can also be guaranteed. The estimation results from the proposed observer and a linearization based observer 2 (lbo) are finally compared under various tire-road conditions with simulations and experiments. The results showed that 1) the proposed observer can always guarantee stability in a wide range of vehicle operations while lbo cannot. 2) w.r.t. root mean square of estimation error (RMS), the proposed observer performs overall better than lbo in µ max estimation.

II. VEHICLE AND TIRE MODEL A. VEHICLE MODEL
In this study, for simplifying the description of vehicle lateral dynamics without losing its main dynamic property, the vehicle is assumed to move on a flat, horizontal surface and the external forces, such as air drag, forces caused by slope and bank, 3 are omitted. In other words, only the tire road interaction forces are exerted on the vehicle. Therefore, a nonlinear single track model, with yaw rate and side slip angle as state variables as shown in Fig. 1, can be used to express the lateral dynamics of the vehicle. It considers lateral load transfer (caused by lateral acceleration), and assumes that µ max is piecewise constant and longitudinal velocity v x is slowly changing or positive constant. 4 The corresponding differential equations are described aṡ where where l f is the distance between the front axle to the center of gravity, l r the distance between the rear axle to the center of gravity, δ the front wheel steering angle, F yf the front axle lateral force, F yr the rear axle tire lateral force, ω the yaw rate, v y the side slip angle, a y the lateral acceleration and I z the moment of inertia of the vehicle, α f the front axle tire slip angle, α r the rear axle tire slip angle.

B. TIRE MODEL
A modified TMsimple [47] for calculating the tire lateral force is utilized in this study, such that the proposed method in 3 In case of slopes and banks in the real application, the work from [46] can be implemented to estimate them. 4 These assumptions are proposed for neglecting time derivatives of µ max and v x . Hence, if the vehicle accelerates or brakes aggressively during cornering, the assumption of v x for single track model in this work does not hold any more. Besides, if the µ max varies dramatically under a fixed road surface due to e.g. varying tire temperature caused by aggressive variation of v x , the assumption for µ max is invalid. this study can be better applied with less calculation burden. The modified TMsimple is described as follows: with where −A ln(1 − π 2B ) is the absolute value of wheel slip angle corresponding to the peak point of lateral force, F z the tire normal force, α the tire slip angle, µ max the tire-road friction coefficient. Descriptions of Y max (F z , µ max ), dY 0 (F z ), Y ∞ (F z , µ max ) and the difference between modified TMsimple and its original version are illustrated in Fig. 2. Besides, Y max and Y ∞ are both proportional to the µ max which, however, does not influence dY 0 . They are all related to normal force F z and are expressed as follows: where F z,nom refers to nominal tire load; µ 0 is the nominal value of the tire-road friction coefficient; a 1 , a 2 , b 1 , b 2 , c 1 , c 2 are derived based on the measurements of Y max (F z,nom , µ 0 ), Y max (F z,2nom , µ 0 ), dY 0 (F z,nom ), dY 0 (F z,2nom ), Y ∞ (F z,nom , µ 0 ), Y ∞ (F z,2nom , µ 0 ) under nominal tire load (F z,nom ) and an extra tire load 5 (F z,2nom ) as well 5 Usually this extra tire load is double the nominal tire load, but other values are also possible. The principle for selecting this extra tire load is to allow the variation range of tire normal force in the real application within the nominal tire load and this extra tire load. as nominal tire-road friction (µ 0 ), and more details of their expressions can refer to [47]. Based on modified TMsimple, we also describes the relationship between ∂Fy ∂α and α, ∂Fy ∂µ max and α with µ max being 1, respectively, as shown in Fig. 3. These relationships demonstrate in which range of α, the α and µ max are easy to be obtained. As can be seen, when α is small, indicating small signal excitation from the driver, the α is simple to be estimated while µ max is difficult to be separated. As α increases, referring to the increase of signal excitation, α is more and more difficult to be identified while µ max becomes easy to be observed. Since β can be expressed by α, they share the same property concerning determination/identification w.r.t. signal excitation.

III. OPTIMIZATION BASED REAL TIME TIRE-ROAD FRICTION COEFFICIENT ESTIMATION A. NOTION OF T -OBSERVABILITY
To facilitate the analysis of the observer performance, the notion of T -observability is defined.
Definition 3.1: Let ι i (t 0 ), i = 1, 2 be two initial states of system (1) and y i (t) be the output corresponding to the trajectory ι i (t).
Remarks 3.1: If T can be chosen arbitrarily small, the definition of T-observability is equal to the traditional definition of observability for a dynamic system [48]. Hence, the T-observability in this study is a looser condition for observability.
Remarks 3.2: Define ι consisting of ω, β and µ max . When the vehicle is always driven mildly with α from each tire staying closely around zero, the system (1) is not T-observable w.r.t. ι all the time, more precisely w.r.t. µ max . On the contrary, when the vehicle is driven aggressively, forcing α from each tire to be in the saturated region from t −T to t, the system (1) is also not T-observable w.r.t. ι at time t, more precisely w.r.t. β. 6 The expression of both f 1 and f 2 can be obtained by comparing (1) with (5). Besides, denote be the domain of all possible values of (β, µ max ). 8 Pick a positive number T as the optimization horizon, whose role will soon be clear.
What is particular about the problem in study is that the derivative of both unknown variables are either known or measurable. As a matter of fact, the value ofβ(t) can be measured and according to the assumptionμ max (t) = 0 whenever µ max (t) is continuous. Consequently, if the value of β(t − T ) is known, β(t) can be obtained by integrating f 1m (t) over time. For Then given p β = β(t) and p µ max = µ max (t), there arē Introduce the following error functions e 1 , e 2 : 6 β can be expressed by α. 7 They are a y , ω, v x . 8 β l , β r represent the left and right boundary of possible values of β, so do µ max,l , µ max,r for possible values of µ max . VOLUME 8, 2020

Remarks 3.3: Error function e 1 (t|p) represents the integration of squared difference between the measured f 1m (t) and f 1 (x, t) from vehicle model along the optimization horizon T . Besides, e 2 (t|p) expresses the integration of squared difference between the measured yaw rate and the one deduced from vehicle model along the optimization horizon T .
Then, it comes the cost function J : where w is the weight that reflects the importance of e 2 relative to e 1 . It is easy to see that if p β , p µ max are the true values of β(t) and µ max (t) respectively, then there must be J (t|p) = 0. Therefore one can find the set P(t) that contain the true value of x(t) by solving the equation J (t|p) = 0. However this equation may not even have a solution in practice as modelling errors and measurement noises are inevitable. Therefore, instead, the following optimization problem is considered: which is normally a non-convex optimization problem and is solved with a grid search method, which will be introduced later. Now, the proposed observer is introduced. Define D(t) as the smallest rectangular that contains P(t) and define a sequence of discrete time where t is the update period of D(t) from the optimization problem. Letβ(t) andμ max (t) be the estimated value of β and µ max at time t respectively. The observer dynamics is described by the following equations for any t ∈ [t k , t k+1 ): The overall diagram of observation process can be seen in Fig. 4. At the beginning of each interval [t k−1 , t k ), the computing unit collects all the data from the sensors and finish solving the optimization problem in (11) before time t k . Therefore at t k , the value of x * k−1 should already be available. It can be deduced that x * k−1 = x(t k−1 ) if the system is T -observable at t k−1 , which implies that the solution of (11) at t k−1 is unique. For the observer to operate in real-time, it is crucial that t is sufficiently large for solving the optimization problem defined in (11). After obtaining k−1 as well as the integration of the measured signalẋ(t). In reality, the model uncertainty, measurement noise and other disturbance will result in jumps in the x r (t) signal at each t k and may lead to growing estimation error in non-T-Observability situations. To alleviate these problems, we firstly conduct some post processing with T-Observability detection for avoiding misupdate of x r (t), which will be described later in section III-B.4. Then we introducex as the filtered signal of x r , such that the estimated statesx is continuous. Finally, the whole observer is summarized in a compact way in Fig. 5. Remarks 3.4: As mentioned in II-A, we assume the longitudinal velocity v x in this study is slow varying or positive constant, and based on this assumption, the proposed observer will operate according to the whole process shown in Fig.5.

2) STABILITY AND CONVERGENCE
If the modelling and measurements are assumed to be perfect, then it can be deduced that the observation scheme described above possesses some important properties regarding the stability and convergence of the estimation error. Lyapunov stability of the estimation error can be guaranteed even when the system lacks observability. In the more favorable situation where the trajectory is T -observable at t k for each positive integer k, global asymptotic stability follows. In the following, these issues will be discussed in detail.
At first, introduce the error variables: e It follows from 14 that for k ∈ Ṅ The following is also truė 81510 VOLUME 8, 2020 Since K is positive definite, it follows thatx r = 0 is globally asymptotically stable. Let 'diam' denote the diameter of a set. Then we have following properties.
k || ≤ r for any k > k 0 . Proof: First notice that at any t k+1 , the following always holds: therefore, Now consider two situations. In the first situation, suppose x r (t k ) ∈ D(t k ), then according to (11) is obtained. In the second situation, suppose x r (t k ) ∈ D(t k ), then from (11) it is easy to deduce that It follows from (19) and (20) that the sequence ||e Suppose k 0 as specified in Proposition 3.1 exits. Since both Proof: According to (13), there is Since the sequence ||e (a) k || is non-increasing, further noticing (15), (18) and thatx r = 0 is globally asymptotically stable, it can be deduced thatx = 0 is globally stable. Furthermore According to Proposition 3.1, the sequence ||e (b) (t k )|| is also non-increasing, further combining (15), it can be thus deduced that the equilibriumx = 0 is globally attractive. Since the equilibriumx = 0 is also globally stable, it can be concluded that equilibriumx = 0 is globally asymptotically stable.
Proposition 3.2 shows that the estimation error of x can be guaranteed not growing even if the system trajectory lacks observability. This can be achieved because the update law for x * k specified by (11) relies on integration of the sensor information during the period of time when the system trajectory lacks observability. In reality the sensor and computation error is inevitable, so one cannot rely on open-loop integration for too long time. It is crucial that the system recovers observability from time to time for an accurate observation.
It can be noted that while observability is lost, it is usually not easy to guarantee a non-increasing estimation error using some other schemes even without model uncertainties and disturbances. For instance, those methods [49] based on strict Lyapunov functions for time-varying systems usually require persistent excitation conditions. If the persistence of excitation is lost, then the Lyapunov function candidate may have positive time-derivatives, which indicates that the estimation error may actually grow over time.

3) GRID-SEARCH BASED OPTIMIZATION
The optimization problem defined in (10) is non-convex in some situations, especially when the vehicle is performing in areas of nonlinear dynamics [50]. As a result, any local optimization methods, such as Sequential Quadratic Programming (SQP) [51], may be trapped in local optima and thus fail to find the global optima. Global non-convex optimization is in general challenging. Stochastic optimization methods, like Differential Evolution [40] (DE), Particle VOLUME 8, 2020 Swarm Optimization [41] (PSO), are possibly good choices. However, if the optimization time or iteration steps are fixed, these methods usually cannot guarantee to output the optimal solution. Therefore, in this work a grid search method [52] is utilized to approximate the solution of the optimization problem. Grid-search is an exhaustive searching method, so the required search time for obtaining global optima can be calculated. Though it suffers from the curse of dimensionality, there are only two dimensions in this problem.
The grid search method is applied as follows: Considering a balance between numerical precision and computation efficiency, a double discretization is implemented. At first, the parameter domain is discretized into a grid with adaptive resolution and the value of the cost function is evaluated at each vertex of the grid to find the optimal solution. Once the parameter node corresponding to the optimal solution is found, one may choose a smaller region 1 ∈ that contains the best node. A discretization of the region 1 with higher resolution can be made for further searching the final optimal solution. This procedure can be carried out iteratively.  The details of the grid search algorithm are described in Fig.6 and Fig.7. In Fig.6, the parameter domain is discretized with adaptive resolution. Firstly begin with how the grid is adaptively discretized. As is known, at t k−1 , for a fixed F yf (t k−1 ) without being zero, the absolute value of the corresponding α f ,l (t k−1 ) based on linear tire model is smaller than α f (t k−1 ) from nonlinear tire model, see Fig. 8. Thus, on the basis of these two front tire slip angles, there are different side slip angles β cl and β shown in the following where both β cl (t k−1 ) and β(t k−1 ) are calculated based on single track model. Subsequently, by subtracting (22) from (21), is deduced. The following observer is utilized to estimate α f ,l (t k−1 ), which is based on [49], where C f α f ,l is used to replace the front axle lateral force F yf , and C f refers to front axle cornering stiffness based on linear single track model, Finally, there is following property: This property is further explained in Appx. A. Therefore, when α f ,l (t k−1 ) is larger than or equal to zero, the real side slip angle β(t k−1 ) should be also larger than or equal to β cl (t k−1 ) and thus the nodes in the corresponding areas with higher resolution ( β 12 ) are evaluated as shown in Fig. 6, and vice versa. After the parameter node corresponding to the optimal solution in is found, a smaller region 1 ∈ shown in Fig. 7 is chosen that contains the best node to further search the more accurate solution with higher resolution ( β 2 , µ max,2 ). Considering the measurement noise, model uncertainty as well as observability influence on the cost evaluations, the whole nodes (within 1 ) whose costs are smaller than k times of the new evaluated minimum cost are collected. Then, among these nodes the minimum and maximum β as well as µ max are chosen, which form the boundary of D(t k−1 ) ∈ 1 . Finally, in D(t k−1 ) the node x * k−1 closest to the x r (t k−1 ) is chosen which is calculated by integration based on the selected node from last grid-search.

4) POST PROCESSING WITH DETECTION OF T-OBSERVABILITY
As we discussed in section III-B.2, the estimation error of the observer does not grow in non-T-Observable situation with assumption of perfect modelling and measurements. However, in real application, this assumption does not hold and the estimation error may increase in non-T-observable conditions. Therefore, to improve the estimation results in post processing, it is necessary to detect when the system (1) is T-Observable and when not, such that the estimation results in non-T-observable condition can be better post-processed.
Before we talk about the checkable condition of T-observability, we would like to discuss its relationship with local observability for nonlinear systems. Normally, a nonlinear system such as (1) can be regarded as locally observable around some state variable ι 0 if its observability matrix w.r.t. ι 0 is full rank [53]. Hence, according to the definition of T-Observability, we can conclude that if the nonlinear system is locally observable w.r.t. ι 0 at some moment between t − T and t, it is T-observable at time t. Here, we can notice, the determination of local observability is related to state variables ι 0 . In other words, the way to check local observability may be state-dependent, which also applies to the T-observability. Since we do not have full access to the values of ι 0 (ω, β, µ max ), there may not be a direct way to check the T-Observability at time t. A more intuitive explanation is introduced as follows: as we wrote in the Remarks III.2, the information of α (can be expressed by unknown element β from ι 0 ) during the last T time period influences the T-observability of the nonlinear system (1). However, α is not available, meaning that there may be no direct way to detect T-observability of the nonlinear system in this study. Therefore, we utilize some indirect ways to check the T-observability. In nature driving situations, a vehicle losing stability with all tires saturated rarely occurs, indicating that it may not be necessary to find ways to detect whether the system (1) is non-T-Observable w.r.t. β. Hence, we focus on detecting when the T-Observability condition is satisfied for obtaining a reliable µ max . We propose a strategy to detect the T-Observability by utilizing a y andβ from the last optimization horizon T . In Fig.9, it is shown in more detail that the whole values of a y andβ from the last optimization horizon T are collected first. Then, if the last grid search is finished and meanwhile it's the time to update the results, the collected a y andβ will be compared with thresholds a y,thres andβ thres , respectively. During comparison, the numbers of a y andβ larger than the proposed thresholds are counted, separately. If these numbers are simultaneously greater than thresholds k n and p n , respectively, the system (1) is regarded as T-Observable w.r.t. that optimization horizon. In this case, the µ max is updated. Otherwise, it's frozen. W.r.t. β, it is always updated since the system (1) is always considered to be T-Observable w.r.t. β. In terms of the reason why we choose a y andβ for help detecting T-Observability, it is explained as follows: Normally, an update of µ max is not reasonable w.r.t. signal excitation for small values of |a y | < 1 m/s 2 . This is because the tires will be in the linear region even on very slippery road conditions such as icy roads during very small a y . In other words, when we start to estimate µ max , it is better that a y is not too small. Besides, a highβ indicates a fast-changing of side slip angle, which in turn hints a high level of excitation and that some of the tires may operate in the nonlinear region [32]. Furthermore, to increase the robustness and accuracy of the estimation results during the last optimization horizon, we define that only when the times in which a y andβ are larger than their thresholds are both greater than pre-defined thresholds, respectively, it is suitable to believe the T-Observability holds and thus to update the estimated results.

A. SIMULATION RESULTS
In this part, based on a nonlinear single track model including lateral load transfer, which is the same as the one used for the observer design, four different maneuvers -sinusoidal, step steer, double lane change as well as a random maneuverare simulated to show the tire-road friction estimation results. In the following simulation, the proposed optimization based observer is abbreviated as opti. Besides, a linearization based observer (lbo) from [43] is introduced as a comparison. The settings conducted for the opti and lbo for simulations can refer to Table 1 and Table 2, separately. It can be noted that the optimization horizon in simulation is 0.35 s, and the update period of the optimization result is 0.1 s, indicating that the grid-search optimization should be finished within 0.1 s. Besides, the search range (β r −β l ) of β and (µ max,r −µ max,l ) of µ max in are set as 0.3 rad and 1, respectively, which is enough for the observation search space in this study. 9 With respect to the discretization in , the length of edge in β direction consist of β 11 = 0.015 rad and β 12 = 0.004 rad, and the length of edge in µ max direction is 0.1. It has to be mentioned that search length of β means the diameter of the β set in 1 in the second discretization, so is the search length of µ max in 1 . 10 Furthermore, the nodes with cost function smaller than k times of the minimum 9 In the future, we will systematically test the real time capability of this observer. Then if the search range of β has to be set large due to the frequently aggressive application scenarios, the optimization process of grid search may need to be improved for balancing search accuracy and efficiency. 10 An example is given to explain the search length in detains: let 1 = [β ,l , β ,r ] × [µ ,l , µ ,r ], then search length of β and µ max in the second discretization are β ,r − β ,l and µ ,r − µ ,l , respectively.   11 constitute D(t). w.r.t. post processing, a y,thres andβ thres are set as 1m/s 2 and 0.0167rad/s in simulations, respectively. Besides, k n and p n are set as 8 and 2, separately. For lbo, the conditions for updating µ max are |a y | ≥ 1 and |β| ≥ 0.002. It can be noted that the µ max will be updated with a smaller threshold of |β| in lbo compared to that in opti. The reasons are expressed as follows: In opti, current and previous measurement data is collected, thus indicating more information of µ max is obtained. Hence, even for larger 11 Based on the evaluation of nodes in 1 . thresholdβ thres , opti still demonstrates good accuracy and convergent speed. On the contrary, lbo only utilizes current information, resulting in extremely low convergent speed in µ max estimation with the same thresholdβ thres . Therefore, in order to balance convergent speed and estimation accuracy, we set a smaller threshold forβ.
The simulation results are illustrated in Fig. 10 -Fig. 13 for the different maneuvers. Table 3 and Table 4 show the statistical analysis (mean value of estimation error (MVE) and root mean square of estimation error (RMS)) of the estimated β and µ max from opti and lbo. As can be seen, opti can   guarantee stability of µ max and β estimation in a wider range of vehicle operations than lbo. Besides, opti can estimate µ max and β overall better than lbo (in stable situation) in terms of RMS. In Fig. 10, β from opti can converge to the real β within 1 s while β from lbo needs more than 2 s. For µ max estimation, opti can estimate the real value much faster VOLUME 8, 2020  Step steer maneuver from low friction to high friction road condition: v x = 15 m/s. Under low friction road condition, the wheel steering angle jumps from 0 to 0.05 rad and then from 0.05 to 0.07 rad under high friction road condition.
than lbo, especially on low friction road condition where opti takes 1 s while lbo up to 9 s. W.r.t. RMS, results from opti are approximately the same (β) and half (µ max ) of those from lbo, respectively, demonstrating the higher accuracy. In Fig. 11, the step steer maneuver doesn't offer persistent excitation for lbo, so both β and µ max have deviation to the real values.
But this excitation is already enough for opti to obtain good estimation results, which is also illustrated by the statistical analysis from Table 3 and Table 4. In Fig. 12, opti estimates β as good as lbo. However, w.r.t. µ max estimation, opti can still accurately obtain the real results both on high and low friction road conditions while lbo can not on low friction road   condition. Besides, µ max from lbo converges slower than that from opti on high friction road condition. What interesting is that RMS of µ max from lbo is slightly better than that from opti, this is because between 5 s and 6 s both µ max estimation results from opti and lbo are frozen and estimated µ max from lbo is coincidently close to the real value, leading to lower RMS. In Fig. 13, the real µ max changes from high friction to low friction, finally jump to the medium road condition. From 0 to 5 s, both opti and lbo cannot estimate µ max accurately, because the excitation is too small, leading to the tire force staying in the quasi-linear region. From 5 s to 12 s, the road condition decrease to 0.3, which brings the tire force into highly nonlinear region, resulting in the instability of lbo. However, opti performs still quite well. Even when the road VOLUME 8, 2020  condition finally jumps to 0.5, lbo is still not able to converge to the real µ max while opti can always reach the real one quite fast. It has to be mentioned that when lbo becomes unstable, the estimated µ max is always being 1.15, which is set as the upper boundary of estimated µ max for lbo.   Table 5 and Table 6. The µ max as well as β are obtained as follows: The reference friction coefficients are inferred from full brake 12 on the concrete test track and lateral limit handling on the grinded ice test track, respectively. The reference lateral velocity is measured using a VBOX speed sensor [55] on concrete test track and is replaced by simulated lateral velocity based on a well parametrized nonlinear single track model 13 (considering lateral load transfer) on the grinded ice test track. Combining measured lateral velocity with longitudinal velocity, 14 β is obtained. The rest of the signals applied in the proposed method are obtained as follows: signals like lateral acceleration and yaw rate are directly measured; The wheel steering angle of front left and right is obtained based on measured steering wheel angle and a lookup table. More details about the used sensors are shown in Table 7. It has to be mentioned that the algorithm was not tested in real-time in the vehicle. On the contrary, we collect the measurement data from the vehicle and then utilize them on the Matlab/Simulink with the proposed algorithm based 12 The tire-road friction coefficient obtained from the vehicle longitudinal and lateral dynamics may be slightly different. But, in this thesis, it is assumed that they are the same. 13 Due to sensor reasons the lateral velocity can not be well measured on the grinded ice test track. Hence, the reference one is replaced by the lateral velocity from a well parametrized nonlinear single track model. 14 The longitudinal velocity is measured by a speed sensor [55].  [54].

FIGURE 15.
Step steer maneuver on high friction road condition: v x = 24 m/s. on an Intel Core i7-4700MQ processor. Hence, in this study, we did not systematically test the real time capability of the algorithm on the vehicle, although its framework has the real time capability theoretically. The parameters and settings conducted for the opti and lbo for experiments can be found in Table 1 and Table 2, separately. For opti, it can be noted that, compared to the optimization horizon (0.35 s) in simulation, it is set to 1 s in the experiment, such that the influence of model uncertainty and measurement noise on the estimation results can be better reduced. The other settings for optimization in the experiments are similar to those in simulation, except for β 2 = 0.001 rad, k = 1.2 and K = min(10, 0.3sign(β r ) β r ) 0 0 3 as well as some parameters in post processing. The reasons for the setting differences are as follows: W.r.t. β 2 , it is set in experiments larger than that in simulations for reducing calculation burden caused by the long optimization VOLUME 8, 2020  horizon; For k, theoretically speaking, even it is set to 1, the real (β, µ max ) should be within the region D(t). However, due to the influence of the measurement noise and model uncertainty, the real (β, µ max ) may jump out of the region D(t) if k is 1 or set too small. Thus in the experiments, k is set larger than that in simulations for guaranteeing D(t) to contain the real (β, µ max ); W.r.t. post processing,β thres is set as 0.025 rad/s in experiments, which is larger than that (0.0167 rad/s) in simulations. This is because during experiments the noise as well as bias superposeβ and a larger threshold can better reduce these influences, resulting in a better detection of fast β variation. Similarly, k n and p n are set larger for experiments than those for simulations, for the same reasons. The settings in K should also be correspondingly adapted for better estimation. For lbo, the condition to update µ max is adapted to |a y | ≥ 1 and |β| ≥ 0.01 considering noisy measurement. The reason why threshold of |β| for updating µ max differs between opti and lbo is already explained in IV-A.
The experimental results are depicted in Fig. 15 and Fig. 16 for the concrete test track, and Fig. 17 and Fig. 18 for the grinded ice test track. Besides, Table 3 and Table 4 show the statistical analysis (mean value of estimation error (MVE) and root mean square of estimation error (RMS)) of β and µ max estimation results from opti and lbo, respectively. It can be concluded that opti can guarantee stability in a wider range of vehicle operations than lbo. Besides, in terms of RMS, opti can estimate tire-road friction coefficient much better than lbo both on high and low friction road condition; On the contrary, opti performs overall similar as lbo in side slip angle estimation on high friction road condition but worse than lbo on grinded ice test track. On the concrete test track, in Fig. 15, opti can estimate tire-road friction (RMS 0.294) and side slip angle (RMS 0.246 • ) better than those (RMS 0.384 (µ max ) and 0.306 • (β)) from lbo, especially from 8 s to 12 s, during which the tire-road friction estimation from opti is still good while that from lbo becomes inferior. In Fig. 16, both opti and lbo perform well. From 10 s on, lbo illustrates even slightly faster convergent ratio than opti in tire-road friction estimation. Besides, the results of side slip angle from opti and lbo both show small deviation against the real measurements from 10 seconds on, which may be caused by model uncertainty w.r.t. the tire nonlinear region and measurement noise. On the grinded ice test track, in Fig. 17, opti can always output a stable tire-road friction coefficient while lbo cannot. Around 30 s, estimation result of µ max from lbo diverges and doesn't converge to vicinity of real value any more. What interesting is lbo with RMS (0.385 • ) can estimate side slip angle better than opti with RMS (0.772 • ). A possible explanation is that, when the tires come into highly saturated region, the cost function (9) is sensitive to the variation of the tire-road friction while robust to the tire slip angle (so is the side slip angle). Consequently, combined with model uncertainty and measurement noise, the size of D(t) in opti may be augmented, especially the length of β, leading to the increase of estimation error in β. However, we have to mention that the reference value of side slip angle on the grinded ice is not from measurements, indicating that our conclusion about side slip angle for opti and lbo on low friction condition may need more validations. In Fig. 18, similar performance like in Fig. 17 is shown, i.e. opti can estimate tire road friction better than lbo, while lbo can observe side slip angle better than opti. Readers may notice that in Fig. 18, the a y reaches roughly 2 m/s 2 , possibly forcing the tires operate in the nonlinear region on the grinded ice road, but the µ max in lbo appears to maintain as 1 all the time. This phenomenon is caused by the property of the lbo itself, which is explaind in Appx. B, since this study focuses on opti.

V. CONCLUSION
In this study, based on vehicle lateral dynamics, an optimization based framework for estimating µ max was proposed. A novelty of the framework is that the µ max can be accurately estimated in real-time together with side slip angle as a by-product without requiring a good initial guess for a non-convex optimization. When the system is T-observable from time to time (much weaker condition than traditional observable definition), the estimation error of side-slip angle and tire-road friction coefficient is globally asymptotically stable. By comparing the proposed method with a linearization based observer (lbo) under various simulations and experiments, it is demonstrated that: 1) the proposed method can guarantee stability of estimation in a wider range of vehicle operations than lbo; 2)W.r.t. root mean square of estimation error (RMS), the performance from the proposed method is overall better than that from lbo in µ max estimation.
In the future, we will not only test the application conditions of the proposed observer with more different maneuvers, but also systematically test its real time capability. Furthermore, we will also focus on estimating a non-conservative lower and upper boundary of estimation error by considering measurement bias, noise as well as model uncertainty. In this way, the estimated tire-road friction coefficient can be more reliably applied in practice.

APPENDIX A
For the nonlinear single track model, there is following dynamic equatioṅ which is already shown in [49] by assuming a constant longitudinal velocity, where α f is front axle wheel slip angle, Now, use C f α f ,l to replace F yf (α f , µ max ), a linear observer with α f ,l as the state is obtained: where (a − b)(cosδ) > 0 and C f < 0 which is the front axle linear tire cornering stiffness. Based on (26) and (27), the following property is proved: Property 5.1: where is deduced, whereα f ,e = α f − α f ,l . As is known, F yf (α f , µ max ) can be expressed asC(α f , µ max )α f , where C f ≤C f (α f , µ max ) < 0. Therefore, (29) can be rewritten aṡ Subsequently, is deduced. The equilibrium point of (31) Thus, it can be deduced that sign(α f ,e ) = sign(α f ,l ). Combined with sign(α f ,e ) = sign(β − β cl ), Property 5.1 is proved.

APPENDIX B
Overall, there are four situations when lbo is not able to estimate µ max accurately. The first is that lbo becomes unstable in some aggressive situations like in Fig. 13 (5 -20 s) and Fig. 17 (29 -45 s). Second, there are situations where although lbo is stable, the excitation is too small, forcing the tires work in the linear region, such as in Fig. 13 (0 -5 s). The third is that lbo is stable and the excitation is also somehow large, but the excitation is not persistent, 15 leading to inferior estimation results, see in Fig. 11 (1 -15 s). The last situation is what we would like to deal with here: lbo is stable with persistent and large excitation, but its convergent rate is somehow still low, such as in Fig. 18. This can be explained as follows.
In Fig. 18 (3 -5 s),μ max converges with extremely low rate and it appears like a constant. This phenomenon is caused by the property of the lbo itself. W.r.t. the update law ofμ max in lbo, it isμ max = k µ ∂â y ∂μ max ( ∂â y ∂v y ) −1 (a y − a y (v y ,μ max )) as shown in Table 2, in which k µ is constant. However, in terms of  Fig. 19. We can notice that, for a fixed α, the larger the µ max is, the smaller is the absolute value of ∂F y ∂µ max ( ∂F y ∂α ) −1 . Hence, on low friction condition, if the initial value ofμ max andv y is set as 1 and 0 respectively, 16 ∂F y ∂μ max ( ∂F y ∂α ) −1 is close to 0, leading to low convergent rate inμ max update during steering. This may hold for aggressive maneuvers depending on the selection of initialμ max . 17 Besides, v x (5.5 m/s on average) in Fig. 18 is also very small, which deteriorates the convergence of µ max . Actually, this phenomenon also occurs in situations ( Fig. 10 from 10 -20 s, Fig. 17 from 10 -20 s) where lbo is applied on the low friction condition with large initialμ max such as 1. Of course, the convergent rate in these situations also changes due to the different v x . Hence, if the initial µ max is set small (for instance, 0.35) for the same maneuver, resulting in relatively large ∂F y ∂μ max ( ∂F y ∂α ) −1 , the convergent rate will increase as shown in Fig. 20. Therefore, if we would like to force lbo to react fast on the low friction condition, we should set the initial value ofμ max small, which, however, is not reasonable for safety reason in the vehicle dynamics control systems. Some readers may also mention we can offer a large initialv y , indicating large initialα f andα r , which is actually not useful. This is because we can freezeμ max until the T-Observability holds in normal driving, butv y will update all the time and we cannot guarantee thatμ max starts to update with a large estimatedα f andα r . He is currently a Research Fellow with the University of Waterloo, Canada. His research interests include nonlinear control, motion planning, time-delay systems, with applications to automotive vehicles and robotics.
ARNO EICHBERGER received the degree in mechanical engineering from the University of Technology Graz, in 1995, and the Ph.D. degree (Hons.) in technical sciences, in 1998. From 1998 to 2007, he was employed at MAGNA STEYR Fahrzeugtechnik AG&Co and dealt with different aspects of active and passive safety. Since 2007, he has been working with the Institute of Automotive Engineering, University of Technology Graz, dealing with Driver Assistance Systems, Vehicle Dynamics and Suspensions. Since 2012, he has been an Associate Professor holding a venia docendi of automotive engineering.
CORNELIA LEX received the M.Sc. and Ph.D. degrees in mechanical engineering from the Graz University of Technology, in 2008 and 2015, respectively. Her Ph.D. thesis deals with tire-road friction estimation using standard on-board vehicle sensors.
She is currently an Assistant Professor of tenure track with the Institute of Automotive Engineering, Graz University of Technology. Her research topics include different scientific questions in vehicle dynamics with the Vehicle Dynamics Research Team that she leads. VOLUME 8, 2020