Uncertainty-Aware Energy Management Strategy for Hybrid Electric Vehicle Using Hybrid Deep Learning Method

Energy management strategy (EMS) is important to ensure the energy-saving performance of hybrid electric vehicles (HEVs). However, the power coupling between different power sources, together with stochastic fluctuations in power demand, poses great challenges for an EMS to achieve desirable performance in real-world scenarios. This paper presents an uncertainty-aware energy management strategy for HEVs. A speed predictor combining a convolutional neural network and a long short-term memory neural network is proposed to extract temporal features that could reveal the speed change mechanism. Then an online self-adaptive transition probability matrix is constructed to estimate the speed prediction uncertainty. Tube model predictive control (tube-MPC) is finally used to solve the optimal control problem in a receding horizon manner. The robust set introduced in the tube-MPC enhances the optimality and robustness of the control sequence under speed prediction uncertainty. Simulations are conducted to verify the effectiveness of the proposed method. Results show that the speed prediction accuracy is 47.4% and 23.1% higher than that of the exponential decay rate prediction model and the autoregressive integrated moving average model, respectively. Compared with the traditional rule-based and MPC methods, the proposed tube-MPC method achieves 10.7% and 3.0% energy-saving performance improvement on average, respectively.


I. INTRODUCTION
Energy shortage is one of the main problems confronting countries around the world in the 21st century [1]. As the mainstay of modern transportation, internal combustion engine vehicles are not only major consumers of energy but also one of the main sources of environmental pollutants [2]. For traditional internal combustion engines, the efficiency of diesel engines is generally lower than 55%, and that of gasoline engines is no higher than 45% [3]. Therefore, to promote the energy-saving and low-carbon development of transportation, the whole automobile industry has in recent years been transforming from "oil-dependent" to "new energy-dependent", and the hybrid electric vehicle (HEV) is the first step in the electrification of transportation [4], [5].
The associate editor coordinating the review of this manuscript and approving it for publication was Vitor Monteiro.
The primary goal of HEV is to improve the efficiency of power system and reduce fuel consumption. When the powertrain configuration is given, the most important factor on HEV fuel consumption is the power allocation ratio between the engine and the electrical system. Energy management strategy (EMS) needs to coordinate the power distribution between the engine and the electrical system under the constraint that the power demand is satisfied. For the same vehicle model and the same driving cycle, the fuel consumption corresponding to different energy management strategies can vary by 20% [6]. Thus, it is of great significance to study the EMS of HEV.
Generally speaking, EMS can be divided into two categories: rule-based and optimization-based. In terms of rule-based methods, Dextreit et al. [7] divided the engine working area into high-load, medium-load, and medium-and-low-load areas, and determined the working mode from the demand power calculated according to the driver's accelerator pedal opening and its rate of change. Li [8] proposed an instantaneous rule-based optimal energy allocation algorithm that considers the impact on fuel economy of the timing at which the engine enters or exits the powertrain. The rule-based control method has the advantages of low computational cost and easy implementation, so it has been widely used in modern mass-produced HEVs [9]-[11]. However, the rule-based method also has an obvious disadvantage. The designed control strategy is usually optimal only under the specific conditions used for rule calibration, while the real conditions the vehicle undergoes are very complex. The rule-based control algorithm lacks adaptability under non-standard working conditions, and its energy-saving performance is greatly weakened in real-world scenarios [12].
Optimization-based algorithms are another kind of EMS that has been widely studied in recent years, including particle swarm optimization (PSO), genetic algorithm (GA), convex optimization, and the most representative one, dynamic programming (DP) [13]-[16]. For example, Zhang et al. [17] used the optimization results of DP to improve the rule-based EMS and achieved a better energy-saving effect. Correa et al. [18] used DP to find the optimal power distribution principle between engine and motor, and designed a fuzzy logic controller that considers state of charge (SOC) balance and driving style recognition. Although DP can obtain the globally optimal solution, it requires the future working conditions in advance and can only be realized offline [19].
To further enhance energy-saving performance, advanced studies incorporate speed prediction to give the EMS prior knowledge of the future driving cycle, so that the optimal decision considers not only the current state but also future situations [20]-[22]. In addition, by substituting DP with reinforcement learning (RL) or model predictive control (MPC), the control policy can be operated online [23], [24]. For example, Teng et al. [25] presented a predictive online energy management strategy for a parallel HEV based on velocity prediction and RL. Menglin et al. [26] adopted deep learning to predict future power demand, thereby proposing an online adaptive EMS and improving efficiency. However, the future speed or power demand is affected by so many stochastic factors (including traffic flow, weather, and the driver's driving style) that the prediction uncertainty cannot be neglected in real-world driving scenarios [27]-[29]. Although the above methods can enhance energy-saving performance when the prediction is accurate, their robustness under uncertain prediction is still in question.
Several advanced EMS researches start to focus on uncertainty introduced by real-world stochastic factors. For example, Shangguan et al. [30] proposed a robust energy management for plug-in hybrid electric buses considering the uncertainties of driving cycles and vehicle mass using Pontryagin's minimum principle method. He et al. [31] incorporated the uncertainty of renewable energy and load when designing EMS for hybrid energy storage system. However, above methods do not work in a predictive manner.
To exploit the advantage of predictive EMS while considering the stochastic factors in real-world driving [32], [33], Zeng et al. [34] proposed a stochastic model predictive control-based EMS using the vehicle location, traveling direction, and terrain information of the area for HEVs running in hilly regions with light traffic. The stochastic property of road grade is modeled as a Markov chain. Zhao et al. [35] proposed a stochastic model predictive control (SMPC) method to exploit the potential performance of dual-motor coupling powertrain, where the uncertainty of velocity prediction is captured and modeled through a novel Signed Markov Chain Monte Carlo method. However, the optimization result output by traditional MPC or SMPC may deviate from the optimal control policy or fail to realize stable control if the uncertainty is not properly quantified. The robustness of proposed method may still confront with challenges when facing strong stochastic factors in dynamic traffic scenes.
Heavy reliance on a model makes traditional MPC or SMPC susceptible to modeling errors and external disturbances, often leading to poor performance or instability. Robust MPC (RMPC) addresses this limitation (at the expense of additional computational complexity) by optimizing over control policies instead of open-loop control actions. Tube MPC is a tractable alternative that decomposes RMPC into an offline robust controller design and an online open-loop MPC problem [36]. Fig.1 gives an intuitive illustration of tube-MPC. Given an ancillary controller and an associated robust control invariant tube, a constraint-tightened version of the nominal MPC problem can be solved to generate an open-loop control input $u^*$ and trajectory $x^*$. Even when subject to model uncertainty, the control sequence constrains the system within the tolerance scope, ensuring the optimality and stability of the solution. In addition, an uncertainty-aware speed predictor, which combines a hybrid deep learning model with transition probability matrix (TPM) based estimation error quantification, is proposed to assist the application of tube-MPC. Thanks to the powerful automatic feature extraction ability of deep learning, the proposed speed predictor achieves accurate speed prediction, which helps to enhance the stability and optimality of the result output by tube-MPC.
Generally, in this paper, a predictive uncertainty-aware EMS for HEV based on tube-MPC is proposed. Firstly, an adaptable online energy management framework is proposed for HEV, considering the uncertainties in speed prediction profiles. Secondly, a hybrid deep learning model is established to forecast the future information and the prediction errors are quantified via TPM. Thirdly, tube-MPC is adopted to optimize the control policy over a moving time window and the method's superiority is verified against some traditional EMS.
The remainder of this paper is organized as follows. Section II gives the general framework of the proposed uncertainty-aware EMS together with highlights of its innovation. Section III and Section IV introduce two vital parts of the proposed EMS, namely the hybrid deep learning-based speed prediction model and the tube-MPC algorithm, respectively. The performance of the proposed method is evaluated through simulation in Section V.
Fig.2 shows the overall architecture of the predictive online energy management strategy. Firstly, at each step, the historical speed trajectory is fed into the speed predictor to forecast the vehicle speed over a fixed-length future time window. Then, the TPM is combined with the speed prediction to give the speed estimation error. Afterwards, tube-MPC is used to solve the resulting optimization problem, which ensures the stability of the control policy even under uncertain speed prediction. Finally, only the first element u(h) of the optimized control sequence is executed on the system, and the system transfers to a new state after control execution. The information of the new state is fed into the MPC controller for control optimization at the next time step.
Fig.3 gives the architecture of the hybrid deep learning based speed prediction model. The input is first fed into the convolutional neural network (CNN) layer to extract high-dimensional features that reflect the internal characteristics of speed change. Then a max-pooling layer is used to down-sample the output to the patch that highlights the most valuable features. Another set of CNN and max-pooling layers is added to further distill the features. Afterwards, a long short-term memory (LSTM) layer is appended to incorporate the extracted features for time series prediction. Finally, a fully connected layer is adopted to produce the final speed prediction. To compensate for the speed prediction error, a TPM accompanies the hybrid network to provide awareness of uncertainty.
Details of each part are described as follows.

A. INPUT LAYER
The general idea of the model is to use data from the last several intervals to predict the speed in the future time window. Let m denote the input data length and p denote the number of input variables; the input can then be expressed as an m × p matrix. In this paper, the input variables include the speed, the acceleration, and the mean and standard deviation of speed over a time window L (L < m), which means p = 4 here.
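As an illustration of the input layer described above, the following minimal sketch assembles an m × 4 input matrix from a speed trace. The function name `build_input`, the sampling interval `dt`, the use of `np.gradient` for acceleration, and the example trace are all assumptions made for illustration; the paper does not specify how the four features are computed.

```python
import numpy as np

def build_input(speed, m, L, dt=1.0):
    """Return an (m, 4) matrix: speed, acceleration, rolling mean, rolling std.

    Hypothetical helper: the rolling statistics use the last L samples
    ending at each time step, which is an assumption.
    """
    speed = np.asarray(speed, dtype=float)
    if not (L < m and len(speed) >= m + L - 1):
        raise ValueError("need L < m and at least m + L - 1 samples")
    accel = np.gradient(speed, dt)          # finite-difference acceleration
    rows = []
    for t in range(len(speed) - m, len(speed)):
        window = speed[t - L + 1 : t + 1]   # last L speed samples up to t
        rows.append([speed[t], accel[t], window.mean(), window.std()])
    return np.array(rows)

history = np.linspace(0.0, 20.0, 41)        # hypothetical speed trace
X = build_input(history, m=10, L=5)
print(X.shape)
```

Each row of `X` corresponds to one past time step and each column to one of the p = 4 input variables.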

B. CNN AND MAX-POOLING LAYER
The calculation of a CNN layer generally includes two steps [37]: first, the convolution operation is carried out by multiplying the convolution kernel with each small patch; then a nonlinear function is applied to introduce nonlinearity. The value $z^l_{i,j,k}$ at position (i, j) in the k-th feature map of the l-th layer is calculated as:
$$z^l_{i,j,k} = (w^l_k)^{T} x^{l-1}_{i,j} + b^l_k$$
where $w^l_k$ and $b^l_k$ are the weight coefficient vector and offset of the k-th convolution kernel of the l-th layer, respectively. The notations w and b are also used to represent the weight and offset vectors of the LSTM layer. $x^{l-1}_{i,j}$ is the patch of input data centered on position (i, j) in layer l−1. The activation function introduces nonlinearity into the CNN so that it can detect nonlinear characteristics. Let a(·) be a nonlinear activation function; then the activation value $a^l_{i,j,k}$ of the convolution feature $z^l_{i,j,k}$ is:
$$a^l_{i,j,k} = a(z^l_{i,j,k})$$
Here, the ReLU activation function is used. The function of the pooling layer is to summarize the features extracted by the convolution layer and compress the information, so that the feature range the CNN can extract is much wider. Let the pooling function be denoted as pool(·); then for each feature map, the output $y^l_{i,j,k}$ is:
$$y^l_{i,j,k} = \mathrm{pool}(a^l_{m,n,k}), \quad \forall (m,n) \in \mathcal{R}_{i,j}$$
where $\mathcal{R}_{i,j}$ denotes the local neighborhood around position (i, j). The kernel sizes of the CNN and max-pooling layers are both set to 2 × 2 in this paper.
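The three operations of this subsection (convolution, ReLU activation, and 2 × 2 max pooling) can be sketched with plain NumPy as follows. This is only a single-channel illustration under assumed values: the kernel `w`, the offset `b`, and the input `x` are hypothetical, and the convolution is implemented as the cross-correlation commonly used in deep learning frameworks.

```python
import numpy as np

def conv2d_valid(x, w, b):
    """Slide kernel w over x (no padding) and add the offset b."""
    kh, kw = w.shape
    H, W = x.shape
    z = np.empty((H - kh + 1, W - kw + 1))
    for i in range(z.shape[0]):
        for j in range(z.shape[1]):
            z[i, j] = np.sum(w * x[i:i + kh, j:j + kw]) + b
    return z

def relu(z):
    """ReLU activation: negative values are set to zero."""
    return np.maximum(z, 0.0)

def max_pool(a, size=2):
    """Non-overlapping max pooling; incomplete border windows are dropped."""
    H, W = a.shape
    H2, W2 = H // size, W // size
    a = a[:H2 * size, :W2 * size]
    return a.reshape(H2, size, W2, size).max(axis=(1, 3))

x = np.arange(20, dtype=float).reshape(5, 4)   # hypothetical 5 x 4 input
w = np.array([[1.0, 0.0], [0.0, -1.0]])        # hypothetical 2 x 2 kernel
y = max_pool(relu(conv2d_valid(x, w, b=6.0)))
print(y.shape)
```

In a trained network the kernel weights and offsets are learned; the fixed values here only make the data flow easy to follow.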

C. LSTM LAYER
The LSTM neural network is a special recurrent neural network (RNN), which is mainly designed to solve the vanishing gradient problem in long sequence training [38]. Each LSTM layer processes two states, namely the hidden state $a^t_m$ and the memory state $c^t_m$. The candidate memory value $\tilde{c}^t_m$ of the m-th LSTM layer at time t is computed from $a^{t-1}_m$, the activation value obtained from the same layer at the previous time step, and $a^t_{m-1}$, the activation value obtained from the previous LSTM layer at the same time step.
For the m-th LSTM layer, three gating operations, namely the memory gate $u_m$, the forget gate $f_m$, and the output gate $o_m$, are applied, each using the sigmoid activation function σ. The memory value $c^t_m$ at time t of the m-th layer is then obtained in two ways: one part is generated by the candidate value at the current time, and the other is carried over from the memory value at the previous time. They are controlled by the memory gate $u_m$ and the forget gate $f_m$, respectively. The hidden state $a^t_m$ at time t of the m-th layer is obtained from the memory value through the output gate $o_m$. The output sequence $y^t$ is obtained from the last softmax activation layer, where M represents the total number of LSTM layers.
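For reference, one common way to write the LSTM relations described in this subsection is given below. This is a sketch in the standard LSTM formulation, not a reproduction of the paper's own equations: the weight and offset symbols ($w_c, b_c, w_u, b_u, w_f, b_f, w_o, b_o, w_y, b_y$) and the concatenation $[a^{t-1}_m, a^t_{m-1}]$ are assumptions, and ⊙ denotes element-wise multiplication.

```latex
\begin{aligned}
\tilde{c}^{\,t}_m &= \tanh\!\left(w_c\,[a^{t-1}_m,\; a^{t}_{m-1}] + b_c\right) \\
u_m &= \sigma\!\left(w_u\,[a^{t-1}_m,\; a^{t}_{m-1}] + b_u\right) \\
f_m &= \sigma\!\left(w_f\,[a^{t-1}_m,\; a^{t}_{m-1}] + b_f\right) \\
o_m &= \sigma\!\left(w_o\,[a^{t-1}_m,\; a^{t}_{m-1}] + b_o\right) \\
c^{t}_m &= u_m \odot \tilde{c}^{\,t}_m + f_m \odot c^{t-1}_m \\
a^{t}_m &= o_m \odot \tanh\!\left(c^{t}_m\right) \\
y^{t} &= \mathrm{softmax}\!\left(w_y\, a^{t}_M + b_y\right)
\end{aligned}
```

The fifth line makes the "two ways" explicit: the memory gate scales the new candidate, while the forget gate scales the memory carried over from the previous time step.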

D. OUTPUT LAYER WITH UNCERTAINTY COMPENSATION
The output of the hybrid deep learning model is the predicted speed over a future time window of length n. To compensate for the uncertainty inherent in speed prediction, an online uncertainty-aware correction based on the TPM is adopted here. Suppose the real speed at future time instant ζ is $v^{\zeta}$; then:
$$v^{\zeta} = v^{\zeta}_p + v^{\zeta}_e$$
where $v^{\zeta}_p$ represents the predicted speed and $v^{\zeta}_e$ denotes the stochastic prediction error. A Gaussian distribution is used to model the error here, which means $v^{\zeta}_e \sim N(0, \sigma^2)$. Assuming that the speed change conforms to the Markov property, a TPM can be used to model the speed transition process. An online TPM updating mechanism with forgetting factor ϕ is adopted here to track the uncertainty trend [39]. In this mechanism, $P_{i,j}(L)$ represents the probability of the transition from $v_i$ to $v_j$, and $p_{i,j}(L) = 1$ if a transition from $v_i$ to $v_j$ occurs at time instant L and 0 otherwise. The forgetting factor ϕ ∈ (0, 1) determines the effective memory depth and controls the rate of updating $P_{i,j}(L)$; it is set to 0.1 in this paper. After obtaining the TPM, speed chains can be sampled. In this paper, 10 chains of length 5 are sampled at each sampling instant, and the averaged standard deviation of the 10 chains is used to approximate σ.
The deep learning based hybrid model can be trained offline using the Adam method and used for speed prediction online. The TPM can be updated online to track the time-varying and stochastic change of the speed estimation error. It should also be mentioned that more variables can be incorporated into the input and output layers if any other predictive information is needed for the subsequent energy management application.
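The online TPM update and the chain sampling can be sketched as below. Since the update equation of [39] is not reproduced in this text, the sketch assumes a simple exponential-forgetting update of the visited row, $P_{i,\cdot} \leftarrow (1-\phi)P_{i,\cdot} + \phi\,p_{i,\cdot}$, which keeps each row normalized; the actual mechanism in [39] may differ. The speed grid, the function names, and the reading of "averaged standard deviation" as the mean of the per-chain standard deviations are also assumptions.

```python
import numpy as np

def update_tpm(P, i, j, phi=0.1):
    """Exponential-forgetting update of row i after observing i -> j."""
    P = P.copy()
    onehot = np.zeros(P.shape[1])
    onehot[j] = 1.0
    P[i] = (1.0 - phi) * P[i] + phi * onehot   # row still sums to 1
    return P

def estimate_sigma(P, grid, start, n_chains=10, length=5, seed=0):
    """Sample Markov chains from P and average their standard deviations."""
    rng = np.random.default_rng(seed)
    stds = []
    for _ in range(n_chains):
        state, chain = start, []
        for _ in range(length):
            state = rng.choice(len(grid), p=P[state])
            chain.append(grid[state])
        stds.append(np.std(chain))
    return float(np.mean(stds))

grid = np.array([0.0, 10.0, 20.0, 30.0])   # hypothetical speed states (km/h)
P = np.full((4, 4), 0.25)                  # uninformative initial TPM
P = update_tpm(P, i=1, j=2)                # observed transition v_1 -> v_2
sigma = estimate_sigma(P, grid, start=1)
```

With ϕ = 0.1, a single observation shifts only 10% of the row's probability mass toward the observed transition, so older transitions fade gradually rather than being overwritten.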

IV. TUBE-MPC BASED PREDICTIVE ONLINE ENERGY MANAGEMENT STRATEGY
To apply optimal control, an HEV system model is a prerequisite. Suppose the HEV system with uncertainty is described by a state equation in which x denotes the state variables, u represents the control variables, and w is the uncertain factor, which refers to the speed uncertainty in this paper. The detailed model of the investigated HEV can be found in the Appendix. In this paper, $x = [\ldots, v_{h+n}]$ and $u = [mode, T_{eng}, n_{eng}]$ (mode refers to electric drive mode, engine drive mode and hybrid mode).
In this paper, tube MPC with a one-step look-ahead robust set is used to solve the above optimization problem online, which can achieve near-optimal control performance even under stochastic interference. Generally, tube MPC includes two parts, namely an offline part and an online part. The standard tube-based MPC algorithm is solved entirely offline, and the only online calculation is to determine which partition the current state lies in. In the online part, different control laws are selected according to the state so as to ensure that the state is kept within the tolerance boundary under uncertainty. Table.1 lists the steps of the algorithm [40], [41].
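The tube idea can be illustrated on a toy scalar system; this is a generic sketch and not the HEV model or the algorithm of Table.1. The dynamics `a`, `b`, the ancillary gain `K`, the disturbance bound `w_max`, and the nominal input sequence are all hypothetical. The applied control is the nominal input plus an ancillary correction, which keeps the real state within a bounded distance (the tube radius) of the nominal trajectory.

```python
import numpy as np

a, b = 1.0, 0.5          # hypothetical scalar dynamics x+ = a*x + b*u + w
K = -1.2                 # ancillary feedback gain, so |a + b*K| = 0.4 < 1
w_max = 0.1              # bound on the disturbance |w|
# Error dynamics e+ = (a + b*K)*e + w give |e| <= w_max / (1 - |a + b*K|)
tube_radius = w_max / (1.0 - abs(a + b * K))

def simulate(x0, u_nominal, seed=0):
    """Return the largest deviation between real and nominal state."""
    rng = np.random.default_rng(seed)
    x, z = x0, x0                     # real state and nominal state
    max_dev = 0.0
    for v in u_nominal:
        u = v + K * (x - z)           # nominal input plus ancillary correction
        w = rng.uniform(-w_max, w_max)
        x = a * x + b * u + w         # disturbed real system
        z = a * z + b * v             # disturbance-free nominal system
        max_dev = max(max_dev, abs(x - z))
    return max_dev

max_dev = simulate(x0=1.0, u_nominal=np.zeros(200) - 0.1)
print(max_dev <= tube_radius)
```

In a full tube-MPC design, the nominal problem is solved with state constraints tightened by the tube radius, so that the real state satisfies the original constraints for every admissible disturbance.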

V. VERIFICATION AND SIMULATION
In this section, the effectiveness of the proposed method is evaluated through simulation. Two aspects of the method are discussed. First, the accuracy of the proposed speed predictor using the hybrid deep learning method is compared with that of traditional methods, including the exponential decay rate prediction model and the autoregressive integrated moving average (ARIMA) model. Then the proposed method is compared with the rule-based method and with the MPC method that does not consider the speed estimation uncertainty.

A. SPEED PREDICTION PERFORMANCE EVALUATION
To evaluate the accuracy of the proposed deep learning based speed prediction method, two conventional methods are used as benchmarks here. The first is the exponential decay rate prediction model, in which $v_h$ represents the starting speed at time point h and ε is the exponential decay factor; different ε values correspond to different vehicle speed decay rates. Here, ε is set to 0.002 through trial and error. The second method is ARIMA(p, d, q), a commonly used linear forecasting approach, where p is the autoregressive order, q is the moving average order, and d is the order of differencing. Here, d is set to 2, and p and q are selected according to the autocorrelation and partial autocorrelation plots of the data.
Fig.4 compares the speed prediction accuracy of the three methods. The speed trajectory used for prediction was collected from real-world driving in Beijing [42]. It can be seen from the figure that for the exponential decay rate prediction model, the predicted speed shows an exponential attenuation trend in the future time window, but obvious differences exist between the real and predicted speed trajectories, especially when there is a sharp speed change around a stop, which means this model is insufficient to capture the complex dynamics of speed variation. The average prediction error is 3.8 km/h in this scenario. For the ARIMA model, through the fitting of a moving window, the deficiency of the exponential decay rate prediction model is alleviated to a certain extent, and the prediction error decreases to 2.6 km/h. However, there is still room to improve the speed prediction accuracy. The proposed hybrid deep learning method achieves the highest prediction accuracy, benefiting from the powerful data mining and processing abilities of CNN and LSTM, which can extract and learn temporal features from historical data. The prediction error is only 2.0 km/h in this scenario.
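The accuracy improvements quoted in the abstract and conclusion appear to be relative reductions in the average prediction error. Assuming that interpretation, the following short check relates them to the three error values reported above; the function name is illustrative only.

```python
def relative_error_reduction(baseline_error, proposed_error):
    """Percentage by which the proposed error is lower than the baseline."""
    return 100.0 * (baseline_error - proposed_error) / baseline_error

# Average prediction errors reported in this subsection (km/h)
vs_exponential = relative_error_reduction(3.8, 2.0)
vs_arima = relative_error_reduction(2.6, 2.0)
print(round(vs_exponential, 1), round(vs_arima, 1))
```

Under this reading, the values round to 47.4% against the exponential decay rate model and 23.1% against ARIMA, matching the figures stated elsewhere in the paper.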

B. ENERGY SAVING PERFORMANCE EVALUATION
In order to verify the superiority of the proposed method in energy-saving performance, rule-based method and traditional MPC method without considering the speed estimation uncertainty are used as benchmark.

1) RULE-BASED METHOD
When the SOC of the battery is initially at a high level (i.e. SOC>70%), the power demand of the vehicle can be fully covered by the battery, thus the vehicle works in pure electric mode. If the battery cannot meet the power demand, then the engine starts. If the difference between the required power of the vehicle and the power in the lowest fuel consumption range of the engine is less than 5%, the vehicle works in the engine drive mode. If the power demand is much higher, the vehicle works in the hybrid acceleration mode. At this time, the engine works in the high-efficiency zone as much as possible, and the insufficient power is supplemented by the power battery. When the battery SOC decreases to 50% after a period of power consumption, engine will be switched on to work near the upper boundary of the optimal area. The part exceeding the demand power is used for battery charging. When the SOC reaches 75%, it will switch to pure electric mode, engine drive mode or hybrid acceleration mode again.
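The rules above can be read as a small state machine with SOC hysteresis between 50% and 75%. The sketch below is one possible interpretation, not the authors' implementation: the class and argument names are hypothetical, the 5% band is taken relative to the engine's optimal power, the initial SOC > 70% condition is not modeled, and any demand that neither the battery nor the engine band covers is assigned to the hybrid acceleration mode.

```python
class RuleBasedEMS:
    """Simplified mode selector with a 50%/75% SOC hysteresis (assumed)."""

    def __init__(self, soc_low=0.50, soc_high=0.75, band=0.05):
        self.soc_low, self.soc_high, self.band = soc_low, soc_high, band
        self.charging = False          # hysteresis flag for the recharge phase

    def select_mode(self, soc, p_demand, p_batt_max, p_eng_opt):
        # Enter the recharge phase at low SOC, leave it once SOC has recovered
        if soc <= self.soc_low:
            self.charging = True
        elif soc >= self.soc_high:
            self.charging = False
        if self.charging:
            return "engine_charging"   # surplus engine power charges the battery
        if p_demand <= p_batt_max:
            return "electric"          # battery alone covers the demand
        if abs(p_demand - p_eng_opt) <= self.band * p_eng_opt:
            return "engine"            # demand is close to the efficient zone
        return "hybrid_acceleration"   # engine plus battery

ems = RuleBasedEMS()
modes = [
    ems.select_mode(0.80, 10.0, 30.0, 40.0),
    ems.select_mode(0.80, 41.0, 30.0, 40.0),
    ems.select_mode(0.80, 60.0, 30.0, 40.0),
    ems.select_mode(0.50, 10.0, 30.0, 40.0),
    ems.select_mode(0.60, 10.0, 30.0, 40.0),
    ems.select_mode(0.75, 10.0, 30.0, 40.0),
]
print(modes)
```

The fifth call shows the hysteresis: at 60% SOC the controller keeps charging because it entered the recharge phase at 50% and has not yet reached 75%.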

2) TRADITIONAL MPC METHOD
This method is similar to the proposed method but does not consider the speed error or the tube effect, which means that the uncertainty compensation part of the speed predictor and the w(h) in Eq. (15) are not considered. The state, action and cost function for this method are the same as those for the proposed method in this paper.
The speed trajectory used in Section V.A is also used here for energy-saving performance evaluation. Table.2 compares the results of the three methods under different initial SOC values. For a fair comparison, the difference in SOC at the end has been weighted and incorporated into the final fuel consumption $Fuel_c$.
It can be seen from Table.2 that the proposed EMS considering speed prediction uncertainty has the lowest equivalent fuel consumption among the three methods. Compared with the rule-based method, the proposed method achieves 11.4%, 8.5% and 12.3% energy-saving improvement under the three defined initial SOC scenarios, respectively. Compared with the traditional MPC method, the proposed method is superior in suppressing frequent engine start/stop and in reducing fuel consumption. By incorporating the speed prediction uncertainty into the EMS, the stochastic variation in future power demand is accounted for by the tube mechanism in tube-MPC, which makes the control policy more careful about the future reward that could result from the current action.
Fig.5 shows the distribution of the engine working points. It can be seen that for the rule-based method, many points are located in the high-power region with a large fuel consumption rate. This is because when the SOC decreases to 50%, the engine starts to work with a high power output that not only satisfies the driving power demand but also charges the battery. This phenomenon greatly deteriorates the energy-saving performance of the rule-based method. The traditional MPC method works much better than the rule-based method: the points in the high-power region are greatly reduced, so the fuel consumption in this scenario is much lower. However, due to the uncertainty in speed prediction, the traditional MPC method cannot realize adaptive correction, so some points deviate from the optimal efficiency line. For the proposed tube-MPC method, the robustness and energy-saving performance are greatly enhanced by considering the uncertainty in speed prediction, and fewer points deviate from the optimal working area.
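As a quick consistency check, averaging the three improvements over the rule-based method reported in this subsection reproduces the 10.7% average stated in the abstract and conclusion:

```latex
\frac{11.4\% + 8.5\% + 12.3\%}{3} = \frac{32.2\%}{3} \approx 10.7\%
```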

VI. CONCLUSION
This paper proposes a predictive EMS framework that considers uncertainty in speed estimation so that the energy-saving performance of the EMS can be further enhanced. A speed predictor based on hybrid deep learning is proposed to extract temporal features that could reveal the speed change mechanism. Then an online self-adaptive TPM is constructed to estimate the speed prediction uncertainty. Tube-MPC is used to solve the optimal control problem in a receding horizon manner. Simulation results show that the speed prediction accuracy is 47.4% and 23.1% higher than that of the exponential decay rate prediction model and the ARIMA model, respectively. Compared with the traditional rule-based and MPC methods, the proposed tube-MPC method achieves 10.7% and 3.0% energy-saving performance improvement on average, respectively.
Considering that RL based EMS has attracted increasing attention in recent years and has demonstrated desirable performance in efficiency improvement, future research may include a comprehensive comparison between RL based EMS and the proposed method to further improve the optimality of the uncertainty-aware predictive EMS framework. In addition, more uncertainty factors, such as HEV fuel consumption and battery SOC, will be incorporated to verify the generalizability of the proposed method.

APPENDIX MATHEMATICAL MODEL OF HEV
The basic parameters of the investigated HEV is listed in Table.3. Fig.6 shows the topology of the power system of the investigated HEV. The main components of the hybrid system include power battery, engine, integrated starter and generator (ISG) motor, drive motor and planetary gear. The planetary gear is a power coupling device, which coordinates the power among ISG motor, engine and drive motor. The carrier of the planetary gear is connected with the engine, the sun gear is connected with the ISG motor, and the ring gear is connected with the drive motor. In addition, the power battery is also connected with the drive motor and ISG motor to provide power source. In order to simplify the representation, the controller and other related parts are omitted. When the system works, the controller will optimize and adjust the power distribution between the power battery and the engine, and integrate the power into the driving motor through the planetary gears for power output to drive the vehicle forward. The mathematical models of the main components are as follows.

A. POWER BATTERY MODEL
Here, the first-order RC equivalent circuit model of the battery cell will be firstly established [43]. Then the model of battery pack will be derived according to the pack topology, which connects the characteristic parameters between the pack and the cell.
The first-order RC model mainly includes the open circuit voltage (OCV), the ohmic internal resistance $R_s$, and an RC network consisting of the polarization resistance $R_p$ and the polarization capacitance $C_p$. According to Kirchhoff's law, the model satisfies the state equation (17), where $V_p$ represents the terminal voltage of the RC network, τ is the time constant (theoretically $\tau = R_p C_p$), $i_b$ represents the excitation current, $C_{batt}$ is the capacity of the power battery, and $V_t$ represents the terminal voltage. In practical applications, the above state equation needs to be discretized. According to control theory, the discretized state equation is Eq. (18), where k represents the time point index and T represents the sampling period.
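A discretized first-order RC model is commonly written with an exact zero-order-hold update of the polarization voltage and coulomb counting for SOC. The sketch below follows that standard form, which may differ in detail from Eq. (17) and Eq. (18); the sign convention (positive current for discharge), the parameter values, and the function names are assumptions.

```python
import math

def rc_step(v_p, soc, i_b, T, R_p, C_p, C_batt):
    """One sampling period of the first-order RC model (discharge positive)."""
    tau = R_p * C_p                        # time constant of the RC network
    decay = math.exp(-T / tau)
    v_p_next = decay * v_p + R_p * (1.0 - decay) * i_b
    soc_next = soc - T * i_b / C_batt      # C_batt in ampere-seconds
    return v_p_next, soc_next

def terminal_voltage(ocv, v_p, i_b, R_s):
    """Terminal voltage: OCV minus polarization and ohmic voltage drops."""
    return ocv - v_p - R_s * i_b

# Hypothetical cell: R_p = 10 mOhm, C_p = 1000 F, 50 Ah, constant 10 A discharge
v_p, soc = 0.0, 0.6
for _ in range(1000):
    v_p, soc = rc_step(v_p, soc, i_b=10.0, T=1.0,
                       R_p=0.01, C_p=1000.0, C_batt=50.0 * 3600.0)
v_t = terminal_voltage(ocv=3.7, v_p=v_p, i_b=10.0, R_s=0.002)
```

Under a constant current, the polarization voltage settles at $R_p i_b$ after a few time constants, which gives a simple sanity check for the discretization.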
The battery pack consists of 148 battery modules connected in series, and each module consists of 3 cells in parallel. So the output voltage of the battery pack is 148 times that of a single cell. The ohmic resistance and polarization resistance of the pack are obtained by multiplying the cell calibration values by 148/3≈49.3, and the polarization capacitance of the pack is obtained by dividing the cell calibration value by 49.3.

B. ENGINE MODEL
Fig.7 shows the two main characteristics of the engine, including the external characteristic curve and the fuel consumption curve. The external characteristic curve reflects the mapping from engine speed $n_{eng}$ to the maximum engine torque $T_{eng-max}$. The maximum torque can only be achieved when the throttle opening reaches 100%. In this paper, we assume that the engine output torque $T_{eng}$ is proportional to the throttle opening γ. The instantaneous fuel consumption $\dot{c}$ can be interpolated from the consumption map based on engine torque and engine speed, as shown in Fig.7.
For the drive motor, as with the fuel consumption map of the engine, the motor efficiency $\eta_{motor}$ is a function of the speed $n_{motor}$ and torque $T_{motor}$ and is obtained from the efficiency map.

D. ISG MODEL
The output torque $T_{ISG}$ and speed $n_{ISG}$ of the ISG motor are also subject to a maximum torque limit. In addition, because the ISG motor in this paper is mainly used to start the engine and adjust the engine speed in the acceleration mode, its working range is relatively narrow. According to engineering experience, the efficiency of the ISG motor $\eta_{ISG}$ is set to a fixed value of 90%.

E. ENERGY FLOW MODEL
The hybrid drive system studied in this paper has three working modes, namely pure electric mode, engine drive mode and hybrid mode. The hybrid mode can be further divided into acceleration mode and regenerative mode according to the discharging/charging state of the power battery, as shown in Fig.8. In this subsection, the energy flow models under the three modes are given.
As shown in Fig.8(a), in the pure electric mode, the vehicle uses the power battery as the sole power source. The ring gear connected to the drive motor rotates with the motor. The carrier connected to the engine stays still, while the ISG motor rotates in the opposite direction without consuming or generating power. The power of the motor satisfies Eq. (25), where $P_{batt}$ is the battery power, calculated by $P_{batt} = V_t i_b$; $\eta_{mech\_p}$ is the mechanical transmission efficiency of the planetary gear, which is taken as 94% here; and $P_{motor\_in}$ indicates the input power of the motor.
In the process of power transmission from motor input to motor output, there is power loss due to motor efficiency, which satisfies:
$$P_{motor\_in}\,\eta_{motor} = P_{motor\_out} = T_{motor}\,n_{motor} \tag{26}$$
where $P_{motor\_out}$ indicates the output power of the motor. In addition, according to the mechanical connection topology, the drive motor is connected to the wheels through the main reducer, so the speed conversion relationship of Eq. (27) holds, where $u_a$ is the vehicle speed, R is the wheel radius, and $i_0$ is the transmission ratio of the main reducer, which is 6.14 here. The speed conversion equation determined by the mechanical connection topology applies not only to the pure electric mode but also to the engine drive mode and the hybrid mode. According to the power balance equation of the vehicle, the required power $P_{veh}$ is given by Eq. (28), where m is the mass of the vehicle, g is the gravitational acceleration, f is the rolling resistance coefficient, $C_D$ is the aerodynamic drag coefficient, A is the frontal area, and α is the road incline, which is taken as 0 for simplicity here. $\eta_{mech}$ refers to the mechanical efficiency from the output end of the motor to the wheel end after passing through transmission parts such as the main reducer, which is taken as 96% according to engineering experience.
As shown in Fig.8(b), in the engine drive mode, the engine speed is coordinated by the ISG motor to drive the vehicle efficiently. According to the number of teeth of the planetary gear and its connections to the various components, the following kinematic equations can be obtained:
$$n_{motor} = 1.36\,n_{eng} - 0.36\,n_{ISG}, \qquad T_{eng} = 0.73\,T_{motor} + 0.27\,T_{ISG} \tag{29}$$
In this mode, the following power balance relationship is satisfied:
$$T_{motor}\,n_{motor}\,\eta_{mech\_p} = P_{veh} \tag{30}$$
$P_{veh}$ still satisfies the power balance relationship of Eq. (28).
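The planetary-gear relations of Eq. (29) can be evaluated directly, as in the sketch below. Only the numerical coefficients come from Eq. (29); the function names and the example speeds are illustrative assumptions.

```python
def motor_speed(n_eng, n_isg):
    """Drive motor (ring gear) speed from engine and ISG speeds, Eq. (29)."""
    return 1.36 * n_eng - 0.36 * n_isg

def isg_speed(n_motor, n_eng):
    """ISG (sun gear) speed required for given motor and engine speeds."""
    return (1.36 * n_eng - n_motor) / 0.36

def engine_torque(t_motor, t_isg):
    """Torque relation of Eq. (29)."""
    return 0.73 * t_motor + 0.27 * t_isg

# Pure electric mode: the engine (carrier) is at rest, the motor turns forward
n_isg_electric = isg_speed(n_motor=1800.0, n_eng=0.0)

# Engine drive mode: the ISG speed that holds the engine at 2000 rpm
# while the drive motor turns at 1500 rpm (hypothetical operating point)
n_isg_engine = isg_speed(n_motor=1500.0, n_eng=2000.0)
print(n_isg_electric, n_isg_engine)
```

The negative ISG speed in the first case is consistent with the statement that the ISG motor rotates in the opposite direction in pure electric mode, and the second case illustrates how the ISG speed sets the engine speed for a given vehicle speed.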
Considering that the power of the ISG motor is provided by the power battery:
$$T_{ISG}\,n_{ISG} = P_{batt} \tag{31}$$
As shown in Fig.8(c) and Fig.8(d), in the hybrid mode, the engine and the drive motor work together to propel the vehicle, and the power balance satisfies:
$$P_{batt} = P_{isg} + P_{motor\_in} \tag{32}$$
In addition, Eq. (26) to Eq. (31) are still valid in this mode. In essence, the hybrid mode is the superposition of the pure electric mode and the engine drive mode. Therefore, integrating the speed transmission relationship and the power balance relationship under the pure electric mode and the engine drive mode yields the energy flow equation under the hybrid mode.
The hybrid mode can be further divided into acceleration mode and regenerative mode, which mainly depends on the state of the battery. If the battery SOC is high, the engine and power battery jointly drive the vehicle forward, which is called acceleration mode. If SOC is low, in order to maintain the battery energy, the engine will output power higher than the driving demand and excess power will be used to charge the battery. This is called regenerative mode. Although there are some differences in energy flow direction between above two modes, there is no difference in dynamic equation due to the same mechanical connection topology. The difference between the two modes lies in the sign of P batt , which is positive under acceleration mode and negative under regenerative mode.

F. OPTIMIZATION TARGET
HEV energy management should reduce fuel consumption as much as possible. In addition, to protect the power battery, the change of SOC should not be too drastic. The cost function is therefore defined to combine the fuel consumption with a penalty on the deviation of the SOC from $SOC_{sust}$, the balanced SOC value, which is set to 60% here. χ and ϕ are positive weight coefficients, and $T_S$ represents the sampling period. In addition, a penalty factor ρ is introduced into the cost function, which equals 1 when the engine starts or stops. This avoids frequent state switching of the engine. In particular, when the engine speed/torque or motor speed/torque exceeds the limit map during the state transition, an additional large penalty ($10^5$ in this paper) is added to the cost function to ensure that the engine and motor work within the allowable range.
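A per-step cost with the ingredients described above might look like the following sketch. The exact form of the paper's cost function is not reproduced in this text, so the additive structure, the quadratic SOC deviation term, the assignment of χ to the SOC term and ϕ to the engine switching term, and the default weights are all assumptions made for illustration.

```python
def stage_cost(fuel_rate, soc, engine_switched, limits_violated,
               T_s=1.0, chi=1.0, phi=1.0, soc_sust=0.60, big_penalty=1e5):
    """Hypothetical stage cost: fuel + SOC deviation + switching + limits."""
    rho = 1.0 if engine_switched else 0.0        # 1 when the engine starts/stops
    cost = fuel_rate * T_s                       # fuel used in one sampling period
    cost += chi * (soc - soc_sust) ** 2          # keep SOC near the balanced value
    cost += phi * rho                            # discourage frequent switching
    if limits_violated:
        cost += big_penalty                      # speed/torque outside limit map
    return cost

base = stage_cost(0.5, 0.60, engine_switched=False, limits_violated=False)
switched = stage_cost(0.5, 0.60, engine_switched=True, limits_violated=False)
violated = stage_cost(0.5, 0.60, engine_switched=False, limits_violated=True)
print(base, switched, violated)
```

Summing this stage cost over the prediction horizon would give the objective minimized at each receding-horizon step; the large penalty effectively turns the limit maps into hard constraints.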