Energy Management of the Power-Split Hybrid Electric City Bus Based on the Stochastic Model Predictive Control

The energy management strategy of a hybrid electric vehicle is of significant importance for improving fuel economy. In this regard, two energy management strategies are designed for a power-split hybrid electric city bus (HECB): one based on linear time-varying stochastic model predictive control (LTV-SMPC) and one based on stochastic model predictive control with Pontryagin's minimum principle (PMP-SMPC). A Markov chain and a long short-term memory (LSTM) network are used to forecast the demand torque and the vehicle velocity, respectively, and together form a combined forecast model. Through linear approximation and simplification of the control model, the nonlinear vehicle model is then converted into a linear time-varying model, and the energy management problem is solved as a linear quadratic programming problem. Considering the linearization error and the limitations of quadratic optimization, Pontryagin's minimum principle (PMP) is also applied to optimize the nonlinear model predictive control. Based on the reference theory, the range of the co-state (coordination) variable is derived, and the optimal co-state variable is searched by bisection (dichotomy) to realize the rolling optimization of the model predictive control (MPC). Finally, the effectiveness of the proposed energy management strategies is verified through several simulated case studies. The results show that, compared with the rule-based (RB) control strategy, LTV-SMPC and PMP-SMPC improve the fuel economy by 8.79% and 14.42%, respectively. Both strategies also show potential for real-time computation.


I. INTRODUCTION
With the rapidly increasing number of automobiles worldwide, the limitations of petroleum resources and environmental pollution issues have become more prominent in the past few decades. Following energy-saving and emission-reduction concepts, automobile manufacturers have adopted engine optimization and electrification technologies to meet strict regulatory requirements. Ahead of the all-electric vehicle era, hybrid technology has attracted wide attention and has been widely applied as a transitional solution for moving beyond the era of internal combustion engines.
The associate editor coordinating the review of this manuscript and approving it for publication was Qiuye Sun.
In this regard, the energy management strategy is a key factor in the performance of hybrid electric vehicles. Current energy management strategies fall into two groups: management of the power grid [1], [2], and management of the fuel economy and emissions of the whole vehicle. The former focuses on the normal operation of electronic equipment in a complex grid environment to ensure the stability and reliability of the whole vehicle [3]. The latter aims to improve the fuel economy and emissions of the vehicle, and is the main topic of the present study. Conventional energy management strategies of this kind comprise rule-based control strategies, optimal control strategies based on optimal control theory, instantaneous optimal control strategies that can be calculated in real time, and energy management strategies based on model predictive control. These strategies are discussed below.
VOLUME 9, 2021. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
Rule-based strategies generally summarize engineers' long-term working experience. Typical representatives include the thermostat method, the power-following method, the engine optimal operating point strategy, and the fuzzy method [4]-[6]. These methods offer reasonable real-time performance, simple and effective algorithms, and simple implementation. However, they have drawbacks such as low overall vehicle performance and poor adaptability to different working conditions. Wirasingha et al. showed that when there are more than two optimization objectives, it is challenging for developers to establish effective fuzzy control rules [7]. Montazeri-Gh et al. developed a multi-input fuzzy logic control strategy for power-split hybrid electric vehicles, which can adapt to different driving cycles. They showed that, for the Toyota Prius power-split hybrid vehicle, the proposed strategy outperforms the rule-based control strategy in terms of fuel economy and emissions [8]. Moreover, Montazeri-Gh et al. proposed a genetic fuzzy control strategy for plug-in hybrid electric vehicles based on traffic state identification and prediction. Based on the identified and predicted driving state, the intelligent controller switches between the fuzzy controllers that are optimal for each traffic condition. A genetic algorithm was then applied to optimize the fuzzy parameters and rules, and the method was shown to reduce fuel consumption and pollutant emissions [9], [10]. Conventional rule-based methods can also be combined with artificial intelligence methods to reduce the computational expense of real-time and online algorithms [11]. Moreover, results obtained from rule-based methods can be used as initial data for other learning methods.
The optimal energy management strategy is obtained by minimizing (or maximizing) the cumulative stage cost of the energy management objective over the driving cycle [12]. The stage cost function mainly includes fuel consumption, deviation of the state of charge (SOC) from its target, pollutant emissions, and driving experience (e.g., gear shifts and braking) [13]-[15]. Typical optimization methods include dynamic programming (DP), based on Bellman's principle of optimality, and methods based on PMP [16]. The DP method cannot be applied directly to vehicle energy management in engineering practice, because it requires the driving cycle to be known in advance and involves a large number of calculations. Currently, DP is used in three ways. Firstly, it serves as a benchmark to evaluate other methods. Secondly, it is used to analyze and extract vehicle control rules, including the switching law of the transmission mode and the shift schedule. Finally, it is applied to improve other methods, such as rule-based methods [17]-[20]. PMP-based methods solve the optimal energy management problem by minimizing the Hamiltonian. Compared with DP, PMP-based methods have a lower computational expense, but they obtain only a locally optimal trajectory [21].
To address problems that require the global operating condition of the vehicle, Wu et al. applied the stochastic dynamic programming (SDP) method to the energy management of parallel hybrid electric vehicles [22]. In SDP-based energy management, existing standard driving cycles or historical driving data are used as samples of the random model, and a statistical model of the driving cycle is established accordingly. The DP method is then used to solve the energy management problem represented by the statistical model [13], [14], [23], [24]. As a special case of SDP, an energy management method based on the shortest-path SDP (SP-SDP) was proposed. Its main hypothesis is that each cycle eventually ends in an absorbing state, such as the vehicle being switched off [25]. Currently, the biggest challenge of SDP-based energy management is its high computational expense, which remains prohibitive even with recent improvements in computational resources. With the advent and development of artificial intelligence technology, neural network-based methods were proposed to address the sensitivity of DP to the driving condition and its curse of dimensionality [26]. In this regard, Zhang et al. applied supervised competitive learning vector quantization and proposed a neural network to predict real-time driving patterns [27]. Moreover, based on information obtained through vehicle-to-vehicle communication, vehicle-to-roadside communication, and other intelligent transportation systems, different models such as the chain-type neural network, the multi-layer perceptron, and the convolutional neural network have been proposed to simulate the velocity field. The predicted velocity is then applied to the equivalent consumption minimization strategy (ECMS) to improve the energy management efficiency [28], [29]. Tian et al. used a PMP method to obtain the optimal control strategy under different cycle conditions [30]. They used a neural network to learn the optimal strategy and obtain the optimal SOC curve for different working conditions. The neural network then selects the corresponding optimal SOC curve as a reference, using the partial working-condition information obtained through the intelligent transportation system, and the selected curve is used in a fuzzy logic controller. The foregoing discussion indicates that such methods usually use existing strategies or data to train the neural network offline and then use the trained network to predict some variables online. Consequently, they cannot be categorized as online methods. Hu et al. proposed a fully data-driven energy management method based on deep reinforcement learning (DRL) to reduce the computational expense in the vehicle controller. However, despite recent improvements in almost all areas, many challenges in current intelligent transportation systems remain unsolved [31]. Consequently, this method still has a long way to go before it can be applied in real vehicles.
Since global optimization algorithms cannot be used directly in real-time control, instantaneous optimization strategies have become a research hot spot. The most typical instantaneous optimization strategy is ECMS, which converts the instantaneous motor power consumption into an equivalent instantaneous fuel consumption to obtain the current equivalent fuel consumption. The optimal control sequence is then obtained by minimizing the equivalent fuel consumption [32]. Since the optimal equivalent factor changes with the driving cycle and the vehicle state, the equivalent factor must be adjusted dynamically. This is normally done through an adaptive ECMS algorithm. Based on SOC feedback, Ju et al. proposed a linear time-varying adaptive ECMS algorithm to achieve adaptive regulation. The results showed that the proposed algorithm is effective in sustaining the battery charge [33]. Furthermore, Feng et al. used a neural network, which operates on the equivalent factor together with the feedback SOC, to plan a reference SOC and adjust the equivalent factor dynamically. Although the proposed method is effective in obtaining the globally optimal energy distribution between the engine and the battery, it does not consider the influence of the proportional-integral (PI) coefficients on the equivalent factor [34]. Chao et al. divided a driving route into several parts, taking the segment between two bus stops as a unit, and applied the particle swarm algorithm to optimize the equivalent factor of each part under different mileage and SOC conditions. The optimization results were then transformed into a map to control the vehicle [35]. Accordingly, they demonstrated that ECMS can be applied effectively as an online method in real-time processes. However, the challenge of obtaining appropriate equivalent factors for different working conditions remains unsolved.
MPC has good feedback correction characteristics and can therefore be introduced into the real-time control of hybrid electric vehicles. Li et al. showed that MPC has great potential for controlling automotive powertrains. Its online computational expense is relatively small compared with global optimization, while it outperforms rule-based strategies in terms of optimization efficiency [36]-[38]. Luo et al. considered the nonlinearity of the engine and motor efficiency and proposed a control strategy based on nonlinear control theory, which greatly improved the safety and fuel economy of engines [39]. Moreover, Cui et al. proposed a stochastic MPC that improves the fuel economy by predicting the driver's demand torque. Compared with the rule-based control strategy, the fuel economy of this method improves by 16.73%. However, the method has a high computational expense, so its calculation speed is relatively slow [40]. Recently, Xie et al. proposed an MPC method based on the optimal depth of discharge (DOD) of the battery. In this method, a back-propagation (BP) neural network first predicts the vehicle speed; the reference SOC is then constructed from the optimal DOD and combined with PMP to achieve a rolling optimization. Finally, the effect of the energy management strategy is verified [41]. Liu et al. proposed an MPC strategy based on Markov prediction and obtained a better optimization effect by improving the prediction accuracy [42]. Borhan et al. simplified the nonlinear model into a linear one and used the linear time-varying MPC algorithm to solve the problem [43]. Despite its simplicity, this method has some shortcomings. More specifically, it cannot predict the driver's demand torque and velocity, and the velocity of standard working conditions is applied directly to the prediction model. Consequently, actual working conditions cannot be simulated accurately.
In the present study, it is intended to utilize a combined prediction model to predict the driver's demand torque and speed. In this regard, the nonlinear vehicle model will be linearized and the nonlinear MPC will be transformed into the LTV-MPC to solve the problem. Based on the state quantity at the current moment and on the premise of ensuring the minimum cost function, it is expected to obtain the optimal distribution sequence of engine torque and engine speed in the predicted time domain. Then the first group of engine torque and speed in the sequence will be applied to the controlled system so that the state quantity at the next moment can be obtained through the whole vehicle model. Conducting this process repeatedly, the optimal torque and speed distribution sequence of the whole working condition can be obtained. Since the transmission system of the power-split HECB has strong nonlinearity and the corresponding cost function is a nonlinear and non-quadratic form of optimized variables, the linear MPC cannot be used effectively. In order to resolve this shortcoming and obtain a better real-time optimal energy management effect, the PMP algorithm is applied to the online optimization of the nonlinear MPC. Finally, the effectiveness of the proposed energy management strategy will be verified through simulations.

II. MODELING THE POWER-SPLIT HECB
A. HECB CONFIGURATION
In the present study, the power-split hybrid system is considered as the research object. As shown in Figure 1, the studied hybrid system consists of an engine, a small motor MG1, a large motor MG2, a buffer locking mechanism (BLM), and a double-planetary-row power-split device. The engine is connected to the planet carrier (C1) of the front planetary row (PG1), while the motors MG1 and MG2 are connected to the sun gear (S1) of PG1 and the sun gear (S2) of the rear planetary row (PG2), respectively. The ring gear (R1) of PG1 is connected to the planet carrier (C2) of PG2, and the coupled power is delivered through the power output shaft. Moreover, the ring gear (R2) of PG2 is connected to the housing, and the BLM is used to reduce the engine vibration and control the engine with the sun gear at the combined state of the planet carrier (C1).
Excellent matching of the main parameters of HECB plays an important role in achieving reasonable dynamic and economic performance from the HECB. Table 1 shows the main parameters of the object vehicle.

B. INTERNAL COMBUSTION ENGINE MODEL
The internal combustion engine (ICE) is the main power source in the HECB. Studies show that the fuel consumption characteristics of the ICE directly affect the vehicle performance, and that the fuel consumption changes with the operating speed and load of the ICE. Generally, the ICE has an optimal operating area in which its efficiency is highest and its fuel consumption rate is lowest. The fuel consumption rate of the ICE can be represented as a function of the instantaneous engine torque $T_{ICE}$ and angular speed $\omega_{ICE}$, as follows: where BSFC denotes the brake specific fuel consumption.
It is worth noting that the ICE cannot operate below the idle speed.
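As an illustration of how a BSFC value translates into a fuel consumption rate, the following minimal Python sketch uses the common relation "fuel rate = BSFC × brake power". The function name, the unit choice (BSFC in g/kWh), the numerical values, and the handling of speeds below idle (fuel rate set to zero) are assumptions made for this example and are not taken from the vehicle model of this paper.

```python
def fuel_rate_g_per_s(torque_nm, speed_rad_s, bsfc_g_per_kwh, idle_speed_rad_s=0.0):
    """Fuel mass flow in g/s from brake power and BSFC (assumed unit: g/kWh)."""
    if speed_rad_s < idle_speed_rad_s:
        # Assumption: below idle speed the ICE is treated as switched off.
        return 0.0
    power_kw = torque_nm * speed_rad_s / 1000.0  # brake power in kW
    return bsfc_g_per_kwh * power_kw / 3600.0    # g/kWh * kW = g/h, converted to g/s


# Hypothetical operating point: 400 N·m at 150 rad/s with a BSFC of 210 g/kWh.
example_rate = fuel_rate_g_per_s(400.0, 150.0, 210.0, idle_speed_rad_s=60.0)
below_idle_rate = fuel_rate_g_per_s(100.0, 30.0, 250.0, idle_speed_rad_s=60.0)
```

In practice the BSFC value would come from a map lookup over torque and speed rather than being passed in as a constant.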

C. ELECTRIC MOTOR MODEL
The introduction of the motor is the key to reducing the fuel consumption of the HECB. Cooperation between the engine and the electric motors keeps the engine working in a high-efficiency area and improves the fuel economy of the vehicle. The electric motor is the electric power source of the test vehicle. When the electric machine works as a motor, it draws power from the battery pack and outputs it to drive the vehicle. When it works as a generator, it absorbs either redundant engine power or regenerative braking power to charge the battery. The motor efficiency can be modeled as a function of the instantaneous electric motor torque $T_{MG1(2)}$ and angular speed $\omega_{MG1(2)}$, as follows: The power of the electric motor can then be calculated from the following expression: When the electric motor works as a generator or as a normal electric motor, $\beta$ equals 1 or −1, respectively.
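The role of the exponent $\beta$ can be illustrated with a short Python sketch. It assumes the widely used form "electrical power = mechanical power × efficiency^β" and decides between motoring and generating from the sign of the mechanical power; the function name, the sign convention (positive power drawn from the battery), and the sample values are assumptions for this example only.

```python
def motor_electrical_power(torque_nm, speed_rad_s, efficiency):
    """Electrical power in W of a motor/generator (positive = drawn from battery)."""
    mechanical_power = torque_nm * speed_rad_s
    # beta = 1 when generating (mechanical power absorbed), -1 when motoring.
    beta = 1 if mechanical_power < 0 else -1
    return mechanical_power * efficiency ** beta


# Motoring: the battery must supply more than the mechanical output.
p_motoring = motor_electrical_power(100.0, 200.0, 0.9)
# Generating: less electrical power is recovered than mechanical power absorbed.
p_generating = motor_electrical_power(-100.0, 200.0, 0.9)
```

With this convention the losses always reduce the useful power: the magnitude of `p_motoring` exceeds 20 kW, while the magnitude of `p_generating` is below it.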

D. BATTERY PACK MODEL
The power battery is the main energy storage element in the HECB powertrain. In the present study, variations of the battery performance over its lifetime and the impact of the ambient temperature on the battery performance are ignored. Furthermore, the power battery is simplified as an equivalent circuit consisting of the open-circuit voltage in series with an internal resistance. Figure 2 illustrates the equivalent circuit of the power battery [33]. Based on the equivalent model, the battery power can be expressed in the form below:

$$P_{Batt} = U_{oc} I_{Batt} - I_{Batt}^{2} R_{Batt} \tag{5}$$

where $U_{oc}$, $I_{Batt}$, and $R_{Batt}$ are the open-circuit voltage, the current in the circuit, and the internal resistance, respectively. According to Kirchhoff's voltage law, variations of the battery SOC can be expressed as:

$$\dot{SOC} = -\frac{I_{Batt}}{Q_{Batt}} \tag{6}$$

where $Q_{Batt}$ denotes the maximum capacity of the battery. Combining Eqs. (5) and (6) results in the following expression:

$$\dot{SOC} = -\frac{U_{oc} - \sqrt{U_{oc}^{2} - 4 R_{Batt} P_{Batt}}}{2 R_{Batt} Q_{Batt}} \tag{7}$$

The power exchange between the battery pack and the electric motor can be mathematically expressed as follows:
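The internal-resistance battery model described above can be sketched in a few lines of Python. The sketch solves the battery power balance for the current and converts it into an SOC rate; the discharge-positive sign convention, the capacity unit (ampere-seconds), the function name, and all numerical values are assumptions for illustration.

```python
import math


def soc_rate(p_batt_w, u_oc_v, r_batt_ohm, q_batt_as):
    """SOC derivative in 1/s for an open-circuit-voltage plus resistance model."""
    discriminant = u_oc_v ** 2 - 4.0 * r_batt_ohm * p_batt_w
    if discriminant < 0.0:
        raise ValueError("requested power exceeds what the battery can deliver")
    # Smaller root of P = U_oc * I - I^2 * R (the physically meaningful current).
    i_batt = (u_oc_v - math.sqrt(discriminant)) / (2.0 * r_batt_ohm)
    return -i_batt / q_batt_as


# Hypothetical pack: 600 V, 0.1 ohm, 50 Ah (= 180000 A·s), delivering 59 kW.
example_soc_rate = soc_rate(59000.0, 600.0, 0.1, 50.0 * 3600.0)
```

For these values the current is 100 A, so the SOC falls at 100 / 180000 per second; a negative power (charging) yields a positive SOC rate.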

E. DYNAMIC MODEL OF THE TRANSMISSION SYSTEM
In order to facilitate the establishment of the model and reduce the corresponding computational expenses, the transmission system of the double planetary row power split system is simplified as a pure rigid system. Figure 3 shows that the moment of inertia of each connecting shaft is concentrated and assumed equivalent to the output end of the power source.
The viscous damping of the input and output shafts is considered in the dynamic analysis. The torque equations of the MG1 and MG2 axes, the input shaft, and the output shaft can be expressed in the form below: where $T$, $L$ and $I$ denote the torque, the equivalent output end, and the equivalent moment of inertia, respectively. Moreover, $\dot{\omega}$ and $c$ are the angular acceleration and the viscous damping coefficient, respectively. Accordingly, $T_L$ and $I_L$ are the load torque and the equivalent moment of inertia at the equivalent output end (including the ring gear, the whole vehicle, the wheels, and the half shafts), respectively. When the viscoelasticity of each shaft is ignored, the angular acceleration of each power element can be calculated as follows: In order to ensure the torque balance of the double planetary row, the following conditions should be satisfied: where $k_p$ is the characteristic parameter of the planetary row, which is defined as the ratio of the number of teeth of the ring gear to that of the sun gear.
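To make the role of the characteristic parameter $k_p$ concrete, the sketch below applies the standard planetary gear speed relation, $\omega_S + k_p \omega_R = (1 + k_p)\,\omega_C$, to the configuration described in Section II-A (engine on C1, MG1 on S1, MG2 on S2, R1 tied to C2 and the output shaft, R2 fixed to the housing). It also shows the steady-state split of the carrier torque when inertia and losses are neglected. Function names and numerical values are hypothetical, and the full dynamic equations of the paper are not reproduced here.

```python
def motor_speeds(omega_ice, omega_out, k1, k2):
    """Speeds of MG1 and MG2 from engine and output-shaft speed (rad/s)."""
    # PG1: sun = MG1, carrier = engine, ring = output shaft.
    omega_mg1 = (1.0 + k1) * omega_ice - k1 * omega_out
    # PG2: ring fixed to the housing, carrier = output shaft, sun = MG2.
    omega_mg2 = (1.0 + k2) * omega_out
    return omega_mg1, omega_mg2


def split_carrier_torque(t_carrier, k):
    """Steady-state torque reaching the sun and ring gear (inertia and losses neglected)."""
    t_sun = t_carrier / (1.0 + k)
    t_ring = k * t_carrier / (1.0 + k)
    return t_sun, t_ring


# Hypothetical values: k1 = 2, k2 = 3, engine at 150 rad/s, output at 100 rad/s.
example_speeds = motor_speeds(150.0, 100.0, 2.0, 3.0)
example_torques = split_carrier_torque(300.0, 2.0)
```

The speed relation shows why the engine speed can be chosen independently of the vehicle speed: MG1 absorbs the difference.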

F. LONGITUDINAL DYNAMICS MODEL OF THE VEHICLE
Currently, only the longitudinal driving dynamics are considered in the vehicle dynamic model. In other words, the model does not involve the vertical vibration of the vehicle or the handling stability during driving. Moreover, the slip between the wheels and the road surface is ignored. In order to represent the vehicle velocity, the longitudinal dynamic equilibrium equation of the vehicle is expressed in the form below [33]: where $T_{drive}$ and $T_{brake}$ denote the driving torque and the braking torque acting on the wheel, respectively. Moreover, $F_{aero}$ is the air resistance, $F_{roll}$ is the rolling resistance, $F_{inertia}$ is the acceleration resistance, $F_{slope}$ is the ramp resistance, $\rho_{air}$ is the air density, and $v$ is the vehicle velocity. Moreover, $f_{r1}$ and $f_{r2}$ denote the first and second rolling resistance coefficients, respectively. Finally, $\alpha$ is the road ramp angle, and $\delta$ is the rotary mass conversion coefficient.
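The resistance terms named above can be evaluated with a small Python sketch. The functional forms used here are common textbook choices and are assumptions rather than the exact expressions of this paper: quadratic aerodynamic drag, a rolling resistance that is linear in velocity through $f_{r1}$ and $f_{r2}$, and an inertia term scaled by $\delta$. The drag coefficient, frontal area, mass, and all numbers are hypothetical.

```python
import math


def road_load(v, a, alpha, mass, rho_air, c_d, area, f_r1, f_r2, delta, g=9.81):
    """Return the individual resistance forces (N) and their sum."""
    forces = {
        "aero": 0.5 * rho_air * c_d * area * v ** 2,
        "roll": mass * g * math.cos(alpha) * (f_r1 + f_r2 * v),
        "slope": mass * g * math.sin(alpha),
        "inertia": delta * mass * a,
    }
    forces["total"] = sum(forces.values())
    return forces


# Hypothetical city bus: 10 t, 10 m/s, accelerating at 0.5 m/s^2 on a flat road.
example_load = road_load(v=10.0, a=0.5, alpha=0.0, mass=10000.0, rho_air=1.2,
                         c_d=0.6, area=7.0, f_r1=0.01, f_r2=0.0001, delta=1.1)
```

Multiplying `example_load["total"]` by the wheel radius gives the net wheel torque needed to follow the requested acceleration.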

G. DRIVER MODEL
The main purpose of the driver model is to simulate the driver's operation. Through the opening signals of the accelerator and brake pedals, the vehicle is driven in accordance with the given cycle condition. In order to enable the vehicle to track the target velocity, a feedback controller based on the PI algorithm is introduced into the driver model. The driver model thus consists of a balancing torque calculation module and a PI feedback controller module. The instantaneous driving balancing torque $T_{resist}$ corresponding to the current vehicle velocity and acceleration can be calculated from Eq. (19), and the adjustment torque $\Delta T$ can be calculated from the difference between the expected and actual vehicle velocities:

$$T_{dmd} = T_{resist} + \Delta T \tag{20}$$

$$\Delta T = K_P (v_r - v_a) + K_I \int (v_r - v_a)\, dt \tag{21}$$

where $T_{dmd}$ is the vehicle demand torque; $K_P$ and $K_I$ are the proportional and integral coefficients, respectively, which are determined by trial and error; and $v_r$ and $v_a$ denote the expected and actual vehicle velocities, respectively.
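A discrete-time version of this driver model can be sketched as follows. The class name, the rectangular integration of the velocity error, and the gain values are assumptions for illustration; the paper only states that the gains are tuned by trial and error.

```python
class PIDriver:
    """Feedforward balancing torque plus PI correction on the velocity error."""

    def __init__(self, k_p, k_i):
        self.k_p = k_p
        self.k_i = k_i
        self.error_integral = 0.0

    def demand_torque(self, v_ref, v_act, t_resist, dt):
        error = v_ref - v_act
        self.error_integral += error * dt  # rectangular integration (assumed)
        delta_t = self.k_p * error + self.k_i * self.error_integral
        return t_resist + delta_t


# Hypothetical gains and one control step of 0.1 s.
driver = PIDriver(k_p=100.0, k_i=10.0)
example_t_dmd = driver.demand_torque(v_ref=10.0, v_act=9.0, t_resist=500.0, dt=0.1)
```

The feedforward term `t_resist` carries most of the demand, so the PI part only has to correct the residual tracking error.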

H. PHYSICAL MODEL
In this section, AVL Cruise is employed as an advanced simulation and analysis software to investigate the automobile power performance and fuel economy. Figure 4 indicates that the forward-facing simulation modeling is used in the present study to establish the power-split HECB model on the AVL Cruise platform.

III. ENERGY MANAGEMENT STRATEGY BASED ON THE STOCHASTIC MODEL PREDICTIVE CONTROL
The stochastic predictive control model is composed of the predictive model and MPC.

A. PREDICTING THE DEMAND TORQUE WITH THE MARKOV CHAIN
Suppose that the driving behavior is modeled by a random process $\omega(\cdot)$, where $\omega(k)$ represents its state at time $k$. The variable $\omega(k)$ can represent the demand torque, acceleration, velocity, or a combination of these variables. Since all of this information can be measured through sensors installed on the vehicle, $\omega(k)$ is known at time $k$ and unknown at any future time $t > k$. Assuming that the evolution of the driving behavior from time $k$ is independent of historical information and is determined only by the current information, $\omega(k)$ can be regarded as a Markov process, and a Markov model can be used to predict its variation. The driver's demand torque and the vehicle velocity are selected as state variables. Applying the nearest neighbor method, these two variables are discretized in the form below:

$$T_{dmd} \in \left\{T_{dmd}^{(1)}, T_{dmd}^{(2)}, \ldots, T_{dmd}^{(N_{T_{dmd}})}\right\}, \quad v \in \left\{v^{(1)}, v^{(2)}, \ldots, v^{(N_v)}\right\} \tag{22}$$

where $N_{T_{dmd}}$ and $N_v$ are the numbers of discrete values of the demand torque and the vehicle velocity, respectively. The transition probability of the driver's demand torque is the probability that a given demand torque occurs at the next moment under the current demand torque and vehicle velocity, which can be mathematically expressed as follows:

$$P_{il,j} = P_t\left[T_{dmd}(k+1) = T_{dmd}^{(j)} \,\middle|\, T_{dmd}(k) = T_{dmd}^{(i)},\ v(k) = v^{(l)}\right] \tag{23}$$

where $P_t$ is the conditional probability and $P_{il,j}$ denotes the probability that the demand torque transfers from $T_{dmd}^{(i)}$ to $T_{dmd}^{(j)}$ at the next moment under the current vehicle velocity $v^{(l)}$. Using maximum likelihood estimation, the transition probability of the demand torque can be obtained through data statistics, in the form below:

$$\hat{P}_{il,j} = \frac{M_{il,j}}{M_{il}} \tag{24}$$

where $\hat{P}_{il,j}$ is the estimated state transition probability; $M_{il,j}$ is the number of times that the demand torque transfers from $T_{dmd}^{(i)}$ to $T_{dmd}^{(j)}$ at the current vehicle velocity $v^{(l)}$; and $M_{il}$ is the total number of transitions of the demand torque from $T_{dmd}^{(i)}$ at the current vehicle velocity $v^{(l)}$.
In the process of data statistics, low-accuracy rounding or discretization of the data may cause a boundary value of the demand torque (e.g., the maximum or minimum value) to appear only once, so that no transition from it is recorded ($M_{il} = 0$). In order to simplify the calculation, it is stipulated in this case that the demand torque remains in the same state at the next moment.
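The counting procedure described above, including the rule for states with no recorded transition, can be sketched with NumPy as follows. The function names, the array layout (velocity index first), and the toy data are assumptions made for this example; the real grids come from the driving cycle data.

```python
import numpy as np


def nearest_index(values, grid):
    """Nearest-neighbor discretization: index of the closest grid point."""
    values = np.asarray(values, dtype=float)
    grid = np.asarray(grid, dtype=float)
    return np.abs(values[:, None] - grid[None, :]).argmin(axis=1)


def estimate_transitions(torque_idx, vel_idx, n_torque, n_vel):
    """Estimate P[l, i, j]: torque state i -> j at velocity state l."""
    counts = np.zeros((n_vel, n_torque, n_torque))
    for k in range(len(torque_idx) - 1):
        counts[vel_idx[k], torque_idx[k], torque_idx[k + 1]] += 1
    totals = counts.sum(axis=2, keepdims=True)
    probs = np.divide(counts, totals, out=np.zeros_like(counts), where=totals > 0)
    # States never left in the data (total count 0) are assumed to stay put.
    for l, i in zip(*np.where(totals[:, :, 0] == 0)):
        probs[l, i, i] = 1.0
    return probs


# Toy sequence with three torque levels and a single velocity level.
toy_torque_idx = nearest_index([1.0, 9.0, 11.0, 2.0, 8.0], [0.0, 10.0, 20.0])
toy_probs = estimate_transitions(toy_torque_idx, [0, 0, 0, 0, 0], n_torque=3, n_vel=1)
```

Every row of the resulting matrices sums to one, so they can be used directly to sample or to propagate a torque distribution over the prediction horizon.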
In the present study, the Chinese city bus cycle (CCBC) and the China-world transient vehicle cycle (C-WTVC) are used as samples to extract the vehicle velocity and demand torque data. After the data are discretized into the form of Eq. (22), the transition probability matrix of the demand torque corresponding to each vehicle velocity value is calculated statistically through Eq. (24). Figures 5-8 show the transition probability matrices of the demand torque for different step lengths under different vehicle velocities. The transition probability is distributed along the diagonal, indicating that there is little difference between the demand torque at the current moment and at the next moment. Meanwhile, the diagonal concentration fades as the prediction step increases. This may be attributed to the wider range of demand torque values that can be reached over longer prediction steps, which scatters the probability distribution.

B. LSTM VEHICLE VELOCITY MODEL
The LSTM neural network is an improved recurrent neural network (RNN) that can effectively avoid the vanishing and exploding gradient problems. By adding memory units, LSTM can identify which information should be retained and which should be discarded, and then store the selected information for a long time. Meanwhile, the control information is transmitted from the previous time step to the next time step. Studies show that LSTM has a strong generalization capability and reasonable learning potential for both large and small data sets, and has a strong advantage in dealing with nonlinear problems [44]. Figure 9 shows the structure of the LSTM cell.
The forgetting gate $f_t$ determines which information of the state $C_{t-1}$ of the previous time step should be discarded and which should be retained. The input gate $i_t$ determines, through $\sigma$, which values are to be updated, while $\tanh$ generates the new candidate values $\tilde{C}_t$ for the update. The unit state is then updated together with the forgetting gate $f_t$, and the updated unit state $C_t$ is passed through the $\tanh$ function and combined with the output gate $O_t$ to give the output $h_t$. These gates and states can be expressed through the following equations:

$$f_t = \sigma\left(W_f [h_{t-1}, x_t] + b_f\right) \tag{25}$$

$$i_t = \sigma\left(W_i [h_{t-1}, x_t] + b_i\right) \tag{26}$$

$$\tilde{C}_t = \tanh\left(W_C [h_{t-1}, x_t] + b_C\right) \tag{27}$$

$$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t \tag{28}$$

$$O_t = \sigma\left(W_O [h_{t-1}, x_t] + b_O\right) \tag{29}$$

$$h_t = O_t \odot \tanh(C_t) \tag{30}$$

where $x_t$ and $h_t$ are the input vector and the output vector, respectively. Moreover, $f$, $i$ and $O$ denote the forgetting gate, input gate and output gate, respectively. $C_{t-1}$ and $C_t$ are the unit states at the previous time and the current time, respectively. $h_{t-1}$ and $h_t$ represent the outputs of the hidden layer unit at the previous time and the current time, respectively. $\sigma$ is the sigmoid activation function, $\tanh$ is the hyperbolic tangent function, and $W$ and $b$ are the weight matrices and bias vectors. Currently, different algorithms are applied to predict the vehicle velocity. For instance, LSTM can be applied to predict the vehicle velocity accurately [44]. Accordingly, the LSTM algorithm is applied in the present study to predict the vehicle velocity. Figure 10 illustrates the obtained results in this regard.
It is observed that in the selected validation data set, the overall error falls within the interval of [−3,3]. It is found that the predicted results obtained from the neural network are consistent with the target value, thereby meeting the expected effect.
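For readers who want to trace the gate computations, the following NumPy sketch performs a single forward step of a standard LSTM cell. It is a generic textbook cell, not the trained velocity predictor of this study; the dictionary-based weight layout, the layer sizes, and the all-zero weights in the example are assumptions chosen so that the result is easy to check by hand.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def lstm_step(x_t, h_prev, c_prev, W, b):
    """One forward step of a standard LSTM cell; W and b are dicts keyed f, i, c, o."""
    z = np.concatenate([h_prev, x_t])          # stacked vector [h_{t-1}, x_t]
    f_t = sigmoid(W["f"] @ z + b["f"])         # forgetting gate
    i_t = sigmoid(W["i"] @ z + b["i"])         # input gate
    c_tilde = np.tanh(W["c"] @ z + b["c"])     # candidate unit state
    c_t = f_t * c_prev + i_t * c_tilde         # updated unit state
    o_t = sigmoid(W["o"] @ z + b["o"])         # output gate
    h_t = o_t * np.tanh(c_t)                   # hidden output
    return h_t, c_t


# Hypothetical sizes: 3 inputs, 2 hidden units, all weights and biases zero.
n_in, n_hid = 3, 2
W0 = {key: np.zeros((n_hid, n_hid + n_in)) for key in "fico"}
b0 = {key: np.zeros(n_hid) for key in "fico"}
h_example, c_example = lstm_step(np.zeros(n_in), np.zeros(n_hid), np.ones(n_hid), W0, b0)
```

With zero weights every gate equals 0.5, so the unit state is simply halved, which makes the example a convenient sanity check before real weights are loaded.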

C. LTV-SMPC
1) BIVARIATE POLYNOMIAL FITTING BASED ON THE LEAST SQUARES METHOD
In the analysis of experimental data in the form of $y = F(x)$, it is essential to analyze the correlation between the variables $x$ and $y$ from a group of experimental data $(x_i, y_i)$ $(i = 0, 1, \ldots, m)$. Generally, there is a certain error between the constructed approximation function $y = F(x)$ and the experimental data points $(x_i, y_i)$, and the Euclidean norm $\|\delta\|_2$ is adopted as the measure of this error, where $\delta = (\delta_0, \delta_1, \delta_2, \cdots, \delta_m)^T$. In order to find the function $y = Q^*(x)$ that best approximates the experimental values in the function space $\Phi = \mathrm{span}\{\varphi_0, \varphi_1, \cdots, \varphi_n\}$, the error sum of squares $\|\delta\|_2^2$ should be minimized. This can be mathematically expressed in the form below:

$$\|\delta\|_2^2 = \sum_{i=0}^{m} \delta_i^2 = \sum_{i=0}^{m} \left[Q^*(x_i) - y_i\right]^2 = \min_{Q \in \Phi} \sum_{i=0}^{m} \left[Q(x_i) - y_i\right]^2 \tag{31}$$

where $Q(x) = a_0 \varphi_0(x) + a_1 \varphi_1(x) + \cdots + a_n \varphi_n(x)$. The least squares theory is then extended to the following bivariate polynomial function $f(x, y)$, which is used as the fitting function:

$$f(x, y) = \sum_{j=0}^{n} \sum_{i=0}^{m} a_{ij}\, x^i y^j \tag{32}$$

The algebraic function $Q(x)$ in Eq. (31) can be replaced by the polynomial function $f(x, y)$. Under this circumstance, the sum of squares of the fitting errors $S$ is as follows:

$$S = \sum_{k} \left[f(x_k, y_k) - z_k\right]^2 \tag{33}$$

where $(x_k, y_k, z_k)$ denote the measured data points. In order to minimize the sum of squares of the errors, the following condition should be satisfied:

$$\frac{\partial S}{\partial a_{ij}} = 0, \quad i = 0, 1, \ldots, m, \quad j = 0, 1, \ldots, n \tag{34}$$

Accordingly, $(m+1) \times (n+1)$ equations are obtained, which form a linear system in the unknown coefficients. Solving this system yields the coefficients of the bivariate polynomial. Consequently, the bivariate polynomial fitting function that best approximates the experimental values can be established.
The engine and electric motor data are obtained from bench tests, and both are scattered data. In order to simplify the subsequent linearization, the least squares method is used to fit the engine and electric motor data. The engine fuel consumption is approximated by a polynomial function of up to the fifth power of the engine torque and the fifth power of the engine speed, as expressed in Eq. (35). Table 2 presents the coefficients of Eq. (35). The root mean square error between the fitted data and the original data is 0.053 g/s.
Tables 3-4 present the coefficients of Eq. (36). The root mean square error between the fitted data and the original data is 0.028.
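The fitting procedure of this subsection can be sketched with NumPy. Instead of assembling the normal equations explicitly, the sketch passes the design matrix to `numpy.linalg.lstsq`, which minimizes the same sum of squared errors; this substitution, the function names, and the synthetic data (not bench-test data) are assumptions made for illustration.

```python
import numpy as np


def fit_bivariate_poly(x, y, z, deg_x, deg_y):
    """Least squares fit of z ~ sum_ij a_ij * x**i * y**j; returns {(i, j): a_ij}."""
    terms = [(i, j) for j in range(deg_y + 1) for i in range(deg_x + 1)]
    design = np.column_stack([x ** i * y ** j for i, j in terms])
    coeffs, *_ = np.linalg.lstsq(design, z, rcond=None)
    return dict(zip(terms, coeffs))


def eval_bivariate_poly(coeffs, x, y):
    return sum(a * x ** i * y ** j for (i, j), a in coeffs.items())


# Synthetic check: recover z = 1 + 2x + 3xy from samples on a 5 x 5 grid.
gx, gy = np.meshgrid(np.linspace(0.0, 1.0, 5), np.linspace(0.0, 1.0, 5))
xs, ys = gx.ravel(), gy.ravel()
zs = 1.0 + 2.0 * xs + 3.0 * xs * ys
fitted = fit_bivariate_poly(xs, ys, zs, deg_x=1, deg_y=1)
rmse = np.sqrt(np.mean((eval_bivariate_poly(fitted, xs, ys) - zs) ** 2))
```

For a fifth-order fit in both variables, as used for the engine map, it is usually advisable to scale torque and speed to a comparable range first, since high powers of raw values make the design matrix poorly conditioned.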

2) MPC
MPC is a real-time optimization algorithm that optimizes the system performance online over a finite time horizon. Accordingly, the prediction model should not be too complicated, so that real-time requirements can be met. However, the power-split HECB contains several components, each of which has complicated dynamic characteristics. Therefore, the faster dynamics of the system are ignored, and only the slow dynamics of the battery SOC and the engine fuel consumption $m_{fuel}$ are considered. The SOC equation of the power battery (Eq. (7)), the vehicle dynamics equation (Eq. (19)), the engine fuel consumption (Eq. (35)), and the motor efficiency equation (Eq. (36)) are combined. Accordingly, the system prediction model can be expressed as follows: where $x$ and $u$ are the system state variable and the system control variable, respectively. Moreover, $d$ and $y$ denote the disturbance input and the output variables of the system, respectively.
Here, $T_{dmd\_N_P}$ and $V_{dmd\_N_P}$ are the required torque and velocity of the system over the finite time horizon, respectively.
Eq. (37) shows that the whole vehicle model is nonlinear. Consequently, solving it directly is relatively slow and easily falls into a local optimum, so the global optimal solution may not be obtained. In order to resolve this shortcoming, the nonlinear model is linearized and transformed into a linear time-varying model to improve the solving speed. In this regard, the state equation (37) is expanded in a Taylor series about the reference state trajectory.
where ∂f /∂ξ is the Jacobian matrix of the nonlinear equation of state at the reference trajectory and ξ ∈ {x, u, d}.
Through the above-mentioned linearization method, Eq. (37) can be processed into the standard form of the linear time-varying model predictive control as follows: The specific expression of each time-varying coefficient matrix is as follows: It is worth noting that the variables in Eq. (39) are consistent with those in Eq. (37). Moreover, d, F, and G in Eq. (39) can be combined into one term as a measurable disturbance term. By discretizing the linearized formula, the following equation is obtained: where $T_s$ is the sampling period. The objective function of the linear model predictive control includes the error between the output variable y and the reference value γ. The control constraints are as follows: When the objective function is optimal, the control variable u, the control variable increment Δu, and the output variable y should satisfy their constraints.
where ε is the relaxation factor. Moreover, the subscripts Min and Max denote the lower and upper boundary values, respectively. Eq. (36) indicates that when the required torque of the driver is predicted based on the Markov model, the engine torque and the motor torque should always meet the equality constraint on the driver's required torque. All output parameters in the prediction horizon can be obtained by solving Eq. (40) iteratively.
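The linearization and discretization steps described above can be illustrated numerically. The sketch below is an assumption-laden example rather than the paper's model: `toy_soc_dynamics` is a hypothetical scalar system, the Jacobians are approximated by central differences instead of the analytic expressions of Eq. (39), and a forward-Euler scheme with sampling period `ts` is assumed because the discretization scheme is not recoverable from the text.

```python
import numpy as np

def jacobian(fun, z0, eps=1e-6):
    """Central-difference Jacobian of fun evaluated at z0."""
    z0 = np.asarray(z0, dtype=float)
    f0 = np.atleast_1d(fun(z0))
    jac = np.zeros((f0.size, z0.size))
    for k in range(z0.size):
        step = np.zeros_like(z0)
        step[k] = eps
        jac[:, k] = (np.atleast_1d(fun(z0 + step)) - np.atleast_1d(fun(z0 - step))) / (2 * eps)
    return jac

def linearize_and_discretize(f, x_ref, u_ref, d_ref, ts):
    """Return (A_d, B_d, E_d, c_d) so that x[k+1] ~ A_d x + B_d u + E_d d + c_d near the reference."""
    x_ref, u_ref, d_ref = (np.atleast_1d(np.asarray(v, dtype=float)) for v in (x_ref, u_ref, d_ref))
    A = jacobian(lambda x: f(x, u_ref, d_ref), x_ref)
    B = jacobian(lambda u: f(x_ref, u, d_ref), u_ref)
    E = jacobian(lambda d: f(x_ref, u_ref, d), d_ref)
    # Affine remainder: makes the linear model match f exactly at the reference point.
    c = np.atleast_1d(f(x_ref, u_ref, d_ref)) - A @ x_ref - B @ u_ref - E @ d_ref
    # Forward-Euler discretization (assumed scheme).
    return np.eye(x_ref.size) + ts * A, ts * B, ts * E, ts * c

def toy_soc_dynamics(x, u, d):
    # Hypothetical scalar dynamics, NOT the vehicle model of Eq. (37).
    return np.array([-0.02 * u[0] * (1.0 + 0.5 * x[0] ** 2) + 0.001 * d[0]])

x_ref, u_ref, d_ref, ts = np.array([0.6]), np.array([2.0]), np.array([10.0]), 1.0
A_d, B_d, E_d, c_d = linearize_and_discretize(toy_soc_dynamics, x_ref, u_ref, d_ref, ts)
x_next_lin = A_d @ x_ref + B_d @ u_ref + E_d @ d_ref + c_d
x_next_euler = x_ref + ts * toy_soc_dynamics(x_ref, u_ref, d_ref)
```

Because the matrices are re-evaluated at each point of the reference trajectory, the resulting model is linear but time-varying, which is what allows the optimization to be posed as a quadratic program.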
Substituting Eq. (43) into the objective function, the standard quadratic programming form can be obtained as follows: Meanwhile, the following inequality constraints must be satisfied.
By solving the standard quadratic programming problem of Eq. (44), the optimal control increment sequence U* can be obtained. Moreover, the optimal control quantity at time k can be obtained.
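To make the structure of this step concrete, the following sketch stacks the predictions of a discrete linear model over the horizon and solves the resulting least-squares problem in closed form. It is a deliberately simplified stand-in: the inequality constraints and the relaxation factor are omitted, the control itself rather than its increment is optimized, the model is time-invariant, and all names (`prediction_matrices`, `first_move`, the weights `q` and `r`) are assumptions introduced for illustration. A constrained problem such as Eq. (44) would require a quadratic programming solver.

```python
import numpy as np

def prediction_matrices(A, B, C, horizon):
    """Stack y[k+1..k+N] = Psi @ x[k] + Theta @ U for x[k+1] = A x[k] + B u[k], y = C x."""
    ny, nu = C.shape[0], B.shape[1]
    Psi = np.vstack([C @ np.linalg.matrix_power(A, i + 1) for i in range(horizon)])
    Theta = np.zeros((ny * horizon, nu * horizon))
    for i in range(horizon):        # row block i predicts step k+i+1
        for j in range(i + 1):      # column block j holds the input applied at step k+j
            Theta[i * ny:(i + 1) * ny, j * nu:(j + 1) * nu] = (
                C @ np.linalg.matrix_power(A, i - j) @ B
            )
    return Psi, Theta

def first_move(A, B, C, x0, y_ref, horizon, q=1.0, r=1e-9):
    """Minimize q*||Y - y_ref||^2 + r*||U||^2 without constraints; return (u[k], U)."""
    Psi, Theta = prediction_matrices(A, B, C, horizon)
    hessian = q * Theta.T @ Theta + r * np.eye(Theta.shape[1])
    gradient_rhs = q * Theta.T @ (y_ref - Psi @ x0)
    U = np.linalg.solve(hessian, gradient_rhs)
    return U[:B.shape[1]], U

# Hypothetical scalar integrator tracking a unit reference over five steps.
A = np.array([[1.0]])
B = np.array([[0.1]])
C = np.array([[1.0]])
x0 = np.array([0.0])
y_ref = np.ones(5)
u_now, U = first_move(A, B, C, x0, y_ref, horizon=5)
Psi, Theta = prediction_matrices(A, B, C, 5)
y_pred = Psi @ x0 + Theta @ U
```

Only the first element of the optimal sequence is applied to the plant; the problem is then re-solved at the next sampling instant with updated measurements, which is the rolling optimization characteristic of MPC.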
Further investigations reveal that the working process of the power-split HECB exhibits complicated nonlinear characteristics that the linear MPC cannot handle adequately. Therefore, nonlinear MPC is established to obtain a reasonable real-time optimization effect. The corresponding performance objective function in the prediction horizon can be expressed as follows: where m and s are the weights of the corresponding terms, $N_P$ is the length of the prediction horizon, and k is the k-th time step. The solution constraint conditions are as follows: where $SOC_{Min}$ and $SOC_{Max}$ denote the boundary SOCs of the battery. According to the optimization objective function of the energy control strategy, the constrained problem is transformed into an unconstrained one, which is solved by using the minimum principle. The Hamiltonian function is defined as: where λ is the coordination state variable, which is obtained by iteration.
The canonical equation is as follows: Eq. (7) presents the expression of the equation of state. If the minimum principle is used, the following Eqs. (50)-(54) must be satisfied.
Furthermore, the constrained SOC boundaries should be considered when minimizing the Hamiltonian function at the k-th preview horizon. For the upper boundary value, and for the lower boundary value, where $SOC^{r}_{k}$ and $SOC^{r}_{k+N_p}$ are the initial and final SOC values over the k-th prediction horizon determined by the reference SOC.
The optimal control input can be obtained according to the following equation:

2) NUMERICAL SOLUTION OF THE PMP-SMPC
In conclusion, the objective function in the prediction horizon can be solved by using the minimum principle.
According to the study of Ju et al. [33], the value range of the coordination variable is mathematically expressed as follows: where $\eta^{Max}_{MG1(2)}$, $\eta^{Max}_{inv}$ and $\eta^{Max}_{Batt}$ are the maximum efficiency of the motor, the maximum efficiency of the inverter and the maximum efficiency of the battery, respectively. Moreover, $\eta^{Min}_{ICE}$ and $Q_{lhv}$ denote the minimum efficiency of the engine and the lower heating value of the fuel, respectively.
The dichotomy (bisection) method is used to solve for the coordination variable. Table 5 shows the specific steps of this solution procedure.
In order to describe the proposed PMP-SMPC method, Figure 11 presents a detailed flowchart of the algorithm.
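The dichotomy search can be sketched as follows. This is an illustrative toy under stated assumptions, not a reproduction of Table 5: the fuel map `fuel_rate`, the battery energy `e_cap`, the demand profile, and the bracket `[0, 1000]` are all hypothetical (in the paper the bracket would come from the range derived after Ju et al. [33]). The sign convention is also an assumption: the Hamiltonian is written as fuel rate plus λ times the SOC depletion rate, so that a larger λ makes battery energy more expensive and raises the final SOC.

```python
def fuel_rate(p_eng):
    """Hypothetical convex fuel-rate map in g/s for engine power in kW."""
    return 0.05 * p_eng + 0.002 * p_eng ** 2

def simulate_horizon(lam, soc0, p_dmd, p_batt_grid, e_cap=2000.0, dt=1.0):
    """Minimize the Hamiltonian at each step for a fixed lam; return the final SOC."""
    soc = soc0
    for p in p_dmd:
        best_h, best_pb = None, 0.0
        for pb in p_batt_grid:
            p_eng = p - pb                      # power balance: demand = engine + battery
            if p_eng < 0.0:
                continue                        # engine cannot absorb power in this toy model
            h = fuel_rate(p_eng) + lam * pb / e_cap
            if best_h is None or h < best_h:
                best_h, best_pb = h, pb
        soc -= dt * best_pb / e_cap             # discharge (pb > 0) lowers the SOC
    return soc

def bisect_costate(soc0, soc_target, p_dmd, p_batt_grid, lam_lo, lam_hi, tol=1e-4, max_iter=60):
    """Bisection on lam until the final SOC meets the target within tol."""
    for _ in range(max_iter):
        lam = 0.5 * (lam_lo + lam_hi)
        soc_end = simulate_horizon(lam, soc0, p_dmd, p_batt_grid)
        if abs(soc_end - soc_target) <= tol:
            break
        if soc_end < soc_target:
            lam_lo = lam                        # battery overused: raise the price of electricity
        else:
            lam_hi = lam                        # battery underused: lower the price
    return lam, soc_end

p_dmd = [10.0, 20.0, 30.0, 40.0, 30.0, 20.0, 10.0, 25.0, 35.0, 15.0]   # kW, hypothetical
p_batt_grid = [-20.0 + 0.5 * i for i in range(81)]                       # candidate battery power, kW
lam, soc_end = bisect_costate(0.6, 0.6, p_dmd, p_batt_grid, lam_lo=0.0, lam_hi=1000.0)
```

Bisection is applicable here because, in this toy model, the final SOC is a non-decreasing function of λ. With a discretized control grid the final SOC changes in steps, so the tolerance must not be finer than the grid allows.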

IV. RESULT ANALYSIS AND COMPARISON
In order to verify the superiority of the control strategy, the control strategy is constructed in MATLAB/Simulink, and the vehicle model is established in AVL Cruise. The vehicle model and control strategy are jointly simulated through the interface.
The simulation analysis is carried out under CCBC and C-WTVC. To verify the effectiveness and real-time performance of the proposed energy management strategy, three energy management strategies (rule-based, LTV-SMPC, and PMP-SMPC) are simulated and compared. Figures 12-18 show the simulation results, including the battery power consumption (ΔSOC), fuel consumption (FC), speed following, economic improvement rate (EIR) compared with the RB strategy, and total simulation time. Here, RB refers to the optimal-curve control strategy.
Figure 12 shows the speed following curves of the three algorithms under CCBC and C-WTVC conditions. The speed curves of the three algorithms are consistent at each time point t, and they follow the target speed reasonably well. Figure 12(a) shows that the speed deviation of the LTV-SMPC control strategy is the largest, with a maximum deviation of 1.03 km·h⁻¹. Moreover, Figure 12(b) shows that the maximum speed deviation of the PMP-SMPC control strategy is 0.78 km·h⁻¹. Figure 13 illustrates the dynamic trajectory of the battery SOC under CCBC and C-WTVC conditions.
In the CCBC condition, the final SOC deviation of PMP-SMPC is 0.0027, which is slightly higher than the RB value of 0.0006 but lower than the LTV-SMPC value of 0.0073. During the whole cycle, the maximum deviation between the actual SOC of PMP-SMPC and the reference SOC is 0.0097, that of LTV-SMPC is 0.0139, and that of RB is 0.0184. All three strategies keep the battery SOC within a certain range.
In the C-WTVC condition, the final SOC deviation of RB is 0.0009, and the final SOC deviations of LTV-SMPC and PMP-SMPC are 0.0272 and 0.0143, respectively. Over the whole cycle, the maximum deviation between the actual SOC of PMP-SMPC and the reference SOC is 0.0262, that of LTV-SMPC is 0.0391, and that of RB is 0.0173. The obtained results show that compared with RB and LTV-SMPC, PMP-SMPC has a more stable trajectory, which verifies that PMP-SMPC can improve the SOC sustainability of the battery.
Figure 14 shows the comparison curves of the engine and motor output torque of the LTV-SMPC strategy under different working conditions. When the vehicle speed is low, the motors MG2 and MG1 operate more, which reduces engine operation in the high fuel consumption region. The engine operates within a relatively narrow range, and its torque fluctuation is small. Figures 13 and 14 show that this strategy can not only improve the fuel economy but also maintain the sustainability of the SOC.
Figures 15 and 16 show the working point distribution of the engine and motors corresponding to the PMP-SMPC control strategy when the dual planetary row hybrid electric city bus is running under different working conditions. The engine operates in the low fuel consumption region, and the brake specific fuel consumption of most operating points is lower than 215 g·(kWh)⁻¹. The working points of motor MG1 are relatively scattered, although the efficiency of most operating points is higher than 0.8. The points of relatively low efficiency arise when both motors drive the vehicle at relatively low speed and high demand torque. The main functions of motor MG2 are to drive the vehicle at low speed and low demand torque, to decouple the engine torque, and to recover braking energy; its efficiency is relatively low.
Moreover, to evaluate the fuel economy and real-time performance of the proposed LTV-SMPC and PMP-SMPC control strategies, Figure 17 compares the fuel consumption of LTV-SMPC, PMP-SMPC, RB, and the traditional bus.
It is observed that in CCBC, the fuel consumption of LTV-SMPC, PMP-SMPC, RB, and the traditional bus is 17.01, 15.96, 18.65, and 28.68, respectively. Compared with the traditional bus and RB, the fuel economy of LTV-SMPC is improved by 40.69% and 8.79%, and that of PMP-SMPC is improved by 44.35% and 14.42%, respectively. In C-WTVC, the fuel consumption of LTV-SMPC, PMP-SMPC, RB, and the traditional bus is 18.56, 17.73, 20.11, and 30.03, respectively. Compared with the traditional bus and RB, the fuel economy of LTV-SMPC is improved by 38.20% and 7.71%, and that of PMP-SMPC is improved by 40.96% and 11.83%, respectively. Figure 18 shows the fuel consumption comparison of the LTV-SMPC and PMP-SMPC control strategies under different operating conditions as the prediction step size increases. For both strategies, as the prediction step size increases, the fuel consumption first decreases and then gradually increases. LTV-SMPC obtains the optimal fuel economy when $N_p$ = 60, and PMP-SMPC obtains the optimal fuel economy when $N_p$ = 40. The prediction error grows with the prediction step size, and the linearization process also introduces a certain error. Therefore, beyond the optimum, the fuel consumption gradually increases with the prediction step size. Tables 6-7 show the simulation results of each strategy under CCBC and C-WTVC.
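The improvement rates quoted above follow directly from the reported fuel consumption values. The short check below recomputes them as the relative reduction with respect to each baseline; the function name `improvement_rate` and the dictionary layout are introduced here for illustration only, and the fuel consumption figures are copied from the text without units, as given there.

```python
def improvement_rate(baseline, value):
    """Relative reduction of fuel consumption with respect to a baseline, in percent."""
    return 100.0 * (baseline - value) / baseline

# Fuel consumption values as reported in the text.
fuel = {
    "CCBC": {"LTV-SMPC": 17.01, "PMP-SMPC": 15.96, "RB": 18.65, "traditional": 28.68},
    "C-WTVC": {"LTV-SMPC": 18.56, "PMP-SMPC": 17.73, "RB": 20.11, "traditional": 30.03},
}

eir = {}
for cycle, values in fuel.items():
    for strategy in ("LTV-SMPC", "PMP-SMPC"):
        for baseline in ("traditional", "RB"):
            key = (cycle, strategy, baseline)
            eir[key] = round(improvement_rate(values[baseline], values[strategy]), 2)

for key, rate in eir.items():
    print(key, f"{rate:.2f}%")
```

All eight recomputed rates agree with the percentages stated in the text to two decimal places.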
Under these two conditions, the simulation time of the energy management strategies based on LTV-SMPC and PMP-SMPC is longer than that of the RB strategy. In order to further assess the real-time performance of the proposed strategies, they are also compared with a control strategy that solves the MPC problem by nonlinear programming (NLP). In the present study, the NLP stochastic model predictive control (NLP-SMPC) has been applied to unmanned driving, and its real-time performance has been verified. The simulation time of the proposed strategies is similar to that of NLP-SMPC.
According to the above-mentioned simulation results, both the optimization-based and the rule-based energy management strategies improve the fuel economy. However, the adaptability of the rule-based energy management strategy is poor, and it must be adjusted for different working conditions.

V. CONCLUSION
The main objective of the present study is to improve the vehicle fuel economy. Based on the structural characteristics of the power-split HECB, the combination forecast model is applied to predict the demanded torque and velocity. Moreover, LTV-SMPC and PMP-SMPC are established to minimize the energy consumption over the prediction horizon, the former by linearizing the model and the latter by solving the objective function with PMP. Finally, numerical simulations are analyzed to verify the effectiveness of the two control strategies.
The simulation results show that the LTV-SMPC and PMP-SMPC strategies improve the fuel economy and the calculation speed. Under CCBC and C-WTVC, the fuel economy of the two strategies is improved, and the changing trend of the battery SOC is consistent with the expected results, which verifies the correctness of the two strategies proposed in this study.
It is found that the fuel consumption of the LTV-SMPC strategy can be reduced by up to 8.79% compared with that of the rule-based strategy, and the fuel consumption of the PMP-SMPC strategy can be reduced by up to 14.42% compared with that of the rule-based strategy. Moreover, both strategies have the potential of real-time computing. In future research, we would like to explore the influence of emissions and control stability on the performance of energy management strategies.
XIAOHU YANG is currently pursuing the M.S. degree in automotive engineering with the College of Mechanical Engineering, Guangxi University, Nanning, China. His research interests include energy management and control strategy optimization of hybrid electric bus.
WEI HUANG was born in Nanning, Guangxi, China, in 1963. He received the Ph.D. degree in engineering mechanics from Beihang University, Beijing, in 2004.
He has been a Professor of mechanical design and theory, vehicle engineering, and a Ph.D. Supervisor of Guangxi University. He served as the Dean of the Department of Transportation, the Dean of the Department of Vehicles and Power, and the Dean of the Department of Vehicle Engineering, the Head of the degree program of mechanical design and theory, and the Head of the degree program of vehicle engineering, and a Responsible Professor of transportation and a Responsible Professor of vehicle engineering. In recent years, he has undertaken more than 50 projects, including six national science and technology projects, many national defense science and technology research funds, more than ten science and technology projects in Guangxi, more than ten science and technology projects in various departments and cities in Guangxi, and many enterprise science and technology projects. He is the author of three books, more than 80 articles, and more than 40 inventions. His research interests include reliability and optimization design, fault diagnosis and failure analysis, computer-aided design and simulation, and remanufacturing technology.
SONG ZHANG received the Ph.D. degree in automotive engineering from Tongji University, Shanghai, China, in 2012.
Since 2012, he has worked with Guangxi Yuchai Machinery Company Limited, Yulin, China. He is mainly involved in control strategy development and system integration design of new energy vehicle power systems. VOLUME 9, 2021