Stochastic Model Predictive Control for Dual-Motor Battery Electric Bus Based on Signed Markov Chain Monte Carlo Method

With the increasing demand for battery electric buses, the dual-motor coupling powertrain (DMCP) shows great advantages, but it makes the energy optimization problem more complex. To solve this hybrid system optimization problem, a stochastic model predictive control (SMPC) method is proposed to exploit the potential performance of the DMCP, where the most critical issue is to improve the prediction accuracy and handle the uncertainties. After analyzing typical velocity profiles, their statistical properties are used to develop a novel Signed Markov Chain Monte Carlo (SMCMC) method that improves the accuracy of velocity prediction by more than 50% compared to conventional Markov Chain methods. Next, considering the uncertainties present in various driving scenarios, a driving cycle recognition model based on fuzzy logic control (FLC) is developed, which allows the category of the current driving cycle to be identified rapidly. Then, dynamic programming (DP) is adopted to solve the rolling optimization problem in each finite horizon online, subject to the necessary dynamic response constraints. Finally, simulation results demonstrate that the proposed energy management strategy handles various daily driving cycles well and improves the energy performance by 6% under a generalized combination of driving conditions compared to a preliminary rule-based control.


I. INTRODUCTION
With growing concerns about environmental protection and fossil fuel shortages, battery electric vehicles (BEVs) play an increasingly prominent role, especially in public transportation. To further improve the comprehensive performance of BEVs, the dual-motor coupling powertrain (DMCP) has recently been widely adopted to coordinate the output power of two propulsion units and thereby enhance overall system efficiency and dynamic response [1]. Compared to the conventional single-motor architecture, simulation analysis under various typical driving cycles demonstrates that the DMCP uses electricity more efficiently [2] and can extend the BEV's driving range by 9% [3]. Although a single motor equipped with a multi-speed transmission can respond to varying load conditions, the improvement is always limited by the single propulsion component, especially in crowded city driving [4]. In addition, the DMCP can easily achieve uninterrupted gear shifting, which improves ride comfort, thanks to its ability to allocate torque continuously between the two motors [5]. Of course, the additional motor also increases the complexity of the related control, so an efficient energy management strategy for power split and mode switching should be developed to realize the potential of BEVs propelled by two motors.
The associate editor coordinating the review of this manuscript and approving it for publication was Zhe Zhang.
VOLUME 8, 2020. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
The existing literature does not offer many results on this subject, as the DMCP architecture is still under development. In general, one can think of a DMCP as a hybrid powertrain in which the combustion engine is replaced by another electric motor. The mathematical formulation of the power split problem in a DMCP system looks very similar to that in a hybrid electric vehicle (HEV), since both have more than one power source to control [5]. Hence, one can leverage the wealth of knowledge in HEV energy management [6], [7]. Deterministic rule-based strategies, such as empirical heuristic schedules and fuzzy logic control (FLC), have been used to rapidly design practical control strategies [8], [9]. These methods offer satisfactory online response with a low computational burden and are easy to adjust and modify, although the calibration process is quite demanding. However, as the DMCP is always designed as a multi-mode driveline, that is, a hybrid system with a discrete mode variable and a continuous power split ratio, it is hard to map out a rational schedule directly. In fact, rule-based strategies always need plenty of prior engineering experience and subsequent recalibration experiments, which are time-consuming. HEV energy management and related optimization problems have been the subject of many studies that use optimal control methods such as dynamic programming (DP) [10] and numerical optimization methods such as simulated annealing (SA) [11] and particle swarm optimization (PSO) [12]. These methods can solve the complete optimization problem but are typically not well suited to online implementation because of their computational cost and the need to know the future driving profile.
The former problem can be solved by proper extraction methods, which can transform the online optimization problem to optimized control rules according to the offline global optimal results [13], [14]. In this way, energy management strategies for electric city buses, whose daily routes are always determined in advance, can be devised rapidly [15]. However, considering the intercity transportation demands with various driving patterns, the online instantaneous optimization method has attracted more and more attention. Stochastic optimal control algorithms are widely used to tackle such online problems and have been proved to be effective, such as stochastic dynamic programming (SDP) [16] and stochastic model predictive control (SMPC) [17], [18]. With high robustness and flexibility, the conventional MPC methods are very prevalent in handling industrial and engineering problems. Moreover, SMPC provides a probabilistic framework for MPC with stochastic uncertainty [19], indicating the random process of forecasting the future driving velocity profile in this case [20]. Thus, without the consideration of an expensive telematics-based approach, a proper prediction model capturing accurately and promptly the driving intention or future velocity change sequence is at the heart of SMPC [21].
Reviewing recent research on SMPC for vehicle energy management problems, various prediction models have been investigated. H. Borhan et al. assumed that the power demand of a vehicle changes exponentially over the prediction horizon and designed an intuitive, exponentially varying velocity predictor [22]. H. He et al. designed a radial basis function neural network (RBF-NN) to forecast the passenger load variation, which proved to be an excellent time series prediction tool with real bus route data [23]. Markov chain (MC) methods are also widely adopted for velocity profile modeling [24]-[26]. C. Sun et al. summarized several effective velocity prediction methodologies and affirmed the value of MC and artificial NN methods [27]. However, some researchers questioned the reliability of the original data and noted that the prediction results need to be further smoothed [24], [28]. In addition, facing increasingly complex transportation scenarios, an adaptive velocity prediction method is necessary. S. Di Cairano devised a framework that combined onboard learning of a Markov Chain with transition probabilities that represent the driver behavior [29]. The simulation results seem reasonable, but the flexibility of the models still needs to be improved. In summary, though many SMPC approaches have been developed, there is still much room for improvement in the accuracy of the velocity prediction and in the flexibility of practical implementation across various driving features.
To address these shortcomings, an online power management strategy based on SMPC with a novel MCMC strategy and a driving pattern recognition method is proposed. The main contributions of this study are as follows. First, the SMPC method with rational constraints is applied to solve the integrated energy management problem for an electric bus with a DMCP. Second, in the velocity forecast process, a novel improved piecewise MC transition probability matrix is built, based on an analysis of the velocity profile distribution features, to make the predicted velocity sequences more reasonable. Third, intercity routes that include different typical driving cycles are considered and handled in a systematic SMPC-based framework with a proper driving pattern recognition model. Finally, simulation tests are conducted to validate the performance of the entire strategy. This paper is organized as follows. In Section II, a novel piecewise MCMC method for velocity prediction is introduced and compared to conventional procedures. Section III presents the DMCP model and the relevant energy management strategy based on SMPC. The analysis and recognition of generalized intercity driving cycles applied in SMPC are described in Section IV. Simulation results and a comparison study are given in Section V. A brief conclusion is drawn in the last section.

II. VELOCITY PREDICTION BASED ON IMPROVED MARKOV CHAIN METHOD
A. BASIC MCMC METHOD
As noted above, the velocity prediction model is the core of the SMPC used to solve the instantaneous energy management problem of the DMCP system. Owing to its stability and practicability, the Markov Chain (MC) method has been widely used to model the driving velocity profile over a future horizon. Under the Markov process assumption, namely that the velocity variation is a random process whose future changes are determined by its most recent values, the basic idea of this method is to use past data to determine the next state according to specific probability distributions [30]. The detailed procedure can be summarized as follows.
Step (1): For a typical driving cycle, the velocity variation is regarded as a time series corresponding to a Markov process. Round the fractional values $v$ to a finite set of discretized levels $v_k$, obtained by dividing the range from the minimum to the maximum velocity into $p$ intervals, as shown in (1).
Step (2): Construct the state transition probability matrix $T_v \in \mathbb{R}^{N \times N}$ of the velocity from the statistical sampling results. Suppose the current velocity value is $v_i$; the time steps in the profile at which it occurs, i.e., $v(q_{ck}) = v_i$, can be collected as $\{q_{c1}, q_{c2}, \ldots, q_{ck}, \ldots, q_{cn}\}$. List the velocity at each following step and check whether $v(q_{ck}+1)$ equals $v_k$. Take the resulting sampling frequency as the probability $P_i$ of a transition from $v_i$ to $v_k$, and repeat the process to obtain the entire matrix $T_v$, as shown in (2).
Step (3): At each stage, convert the real current velocity to the nearest neighbor value from the discretized velocity set. Then based on the pretreated velocity and the transition probability matrix, the next velocity prediction can be obtained by comparing the corresponding probability distribution, namely the most probable velocity value can be determined. Recursively running this first-order Markov Chain model by inputting the latest value, a sequence of enumerated velocity can be eventually forecasted.
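Steps (1) to (3) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the uniform grid starting at zero, and the "stay in place" fallback for velocity levels that never occur in the data are all choices made here for the example.

```python
import numpy as np

def build_transition_matrix(velocity, v_max, n_states):
    """Estimate a first-order transition matrix from a velocity trace."""
    # Step (1): quantize each sample to the nearest of n_states grid levels.
    grid = np.linspace(0.0, v_max, n_states)
    idx = np.abs(np.asarray(velocity)[:, None] - grid[None, :]).argmin(axis=1)
    # Step (2): count i -> j transitions, then normalize each row to frequencies.
    counts = np.zeros((n_states, n_states))
    for i, j in zip(idx[:-1], idx[1:]):
        counts[i, j] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    # Assumption: levels never visited keep a "stay in place" row so T stays stochastic.
    T = np.where(row_sums > 0, counts / np.maximum(row_sums, 1), np.eye(n_states))
    return grid, T

def predict_mc(v_now, grid, T, horizon):
    """Step (3): snap to the nearest level, then follow the most probable transition."""
    i = int(np.abs(grid - v_now).argmin())
    out = []
    for _ in range(horizon):
        i = int(T[i].argmax())
        out.append(float(grid[i]))
    return out
```

Because `predict_mc` always follows the single most probable transition, it reproduces the deterministic behaviour of the basic MC predictor: the same current velocity always yields the same forecast sequence.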
The basic MC method is easy to implement and does not produce abrupt, unexplained jumps. However, its limitation is also evident because its state transition rule is deterministic. The real velocity is a series of random real values, whereas the forecasts derived from MC are always deterministic, regular values. In other words, the MC method has inherent errors and is prone to directional deviations. To approach the original characteristics, Monte Carlo simulation, a powerful statistical analysis tool, is commonly combined with it to form the so-called Markov Chain Monte Carlo (MCMC) method [31]. Its essential idea is to use randomness to eliminate the deterministic elements, which compensates for the shortcomings of MC. Once the transition probabilities are available, the remaining steps are simply to draw a suitable number of random samples from the posterior probability distribution and then take their sample mean as the estimate.
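The sampling step described above can be sketched as follows. The function name, the sample count, and the fixed seed are assumptions made for this illustration; `grid` and `T` stand for the discretized velocity levels and a row-stochastic transition matrix built as in Steps (1) and (2).

```python
import numpy as np

def predict_mcmc(v_now, grid, T, horizon, n_samples=200, seed=0):
    """Average many random walks through the transition matrix T."""
    rng = np.random.default_rng(seed)
    start = int(np.abs(grid - v_now).argmin())
    paths = np.empty((n_samples, horizon))
    for s in range(n_samples):
        i = start
        for k in range(horizon):
            # Draw the next state according to the row of the current state.
            i = rng.choice(len(grid), p=T[i])
            paths[s, k] = grid[i]
    # The sample mean over all walks is used as the forecast for each step.
    return paths.mean(axis=0)
```

With a transition row that is symmetric around the current velocity, the sample mean approaches the current velocity itself as `n_samples` grows, so the forecast stays nearly flat.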
Here we take the typical Chinese city bus driving cycle (CCBC) as the original dataset to show the details of the prediction process above. Following the MC method steps, it is easy to construct the state transition probability matrix, as shown in Fig. 1. We can see that in most states there is a dominant distribution that captures the major transition features, which accounts for the rationality of the basic MC method. However, it is also clear that MC ignores other factors and struggles with nearly uniform distributions, which is why the Monte Carlo simulation process is needed. To evaluate the prediction accuracy, the root mean square error (RMSE) in each horizon is calculated, and the prediction results are illustrated in Fig. 2, where the RMSE of the MC method is clearly higher and fluctuates more. That is, MCMC performs better than MC but is still not ideal. As the number of samples increases, the MCMC predictions remain nearly constant over many horizons, much like the results of a zero-order hold (ZOH). This phenomenon is analyzed and improved in the following parts.

B. FEATURES OF TYPICAL PROFILE ANALYSIS
Looking at the CCBC velocity profile in Fig. 3, the change of velocity resembles a combination of many repeating speed segments with similar rising and falling trends. Suppose the current velocity is 40 km/h; we can easily find the velocities of the next states, which come in pairs of increasing and decreasing values. Following the conventional MCMC method introduced above, the transition probabilities with an interval of 1 km/h are [42(50%), 38(25%), 37(12.5%), 39(12.5%)] km/h. The next velocity is then 42 km/h according to the MC principle, whereas the mathematical expectation is 40 km/h when the number of samples in the MCMC method tends to infinity. This example provides an intuitive explanation of why the MC method always yields deterministic, regular predictions with a specific deviation and why the results of the MCMC method tend to remain constant.
Fig. 4 demonstrates the distribution of the current velocity and its variation in the next state under the CCBC cycle from a statistical perspective. We can see that the velocity variation is symmetrically distributed. That is, at each velocity state, the probabilities of increasing and decreasing are roughly equal, and the magnitudes of change on both sides also appear symmetrical. Hence it is better to separate the positive and negative parts and calculate them individually so that they do not cancel each other out.
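The 40 km/h example can be checked directly from the listed transition probabilities. The most probable next state is 42 km/h, while the expectation reproduces the current velocity exactly:

```latex
\begin{aligned}
\mathbb{E}[v_{k+1} \mid v_k = 40]
  &= 0.5 \cdot 42 + 0.25 \cdot 38 + 0.125 \cdot 37 + 0.125 \cdot 39 \\
  &= 21 + 9.5 + 4.625 + 4.875 = 40 \ \text{km/h}
\end{aligned}
```

The increasing branch (42 km/h, weight 0.5) and the decreasing branches (37 to 39 km/h, total weight 0.5) cancel in the mean, which is the averaging effect that the piecewise treatment is meant to avoid.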

C. PIECEWISE MCMC METHOD
1) PIECEWISE STATE TRANSITION PROBABILITIES
Given the symmetrical distribution of the velocity variation, an improved piecewise MCMC method can be devised. The basic idea is to separate the original state transition probability matrix into two individual matrices that represent the transition trends during acceleration and braking, respectively. As shown in Fig. 5, the valid probabilities all lie on only one side of the diagonal, which means the next value keeps its monotonic trend. Compared to the distribution in Fig. 1, half of the misleading information is avoided by the piecewise matrices. A proper acceleration state prediction model should then be established to identify which matrix applies to the current velocity. One option is simply to classify all acceleration states in the future horizon by the current acceleration, since it does not change frequently within a short period; this method is called Classified MCMC. Alternatively, a Signed MCMC method can be used, in which the sign of the acceleration is forecast for each state.
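A minimal sketch of the piecewise idea is given below. It is an illustration only: the function names are hypothetical, zero-change transitions are assigned to the acceleration matrix by assumption, and the forecast uses the row expectation (the large-sample limit of Monte Carlo sampling) instead of explicit random draws.

```python
import numpy as np

def build_signed_matrices(velocity, v_max, n_states):
    """Split the transition counts into an acceleration and a braking matrix."""
    grid = np.linspace(0.0, v_max, n_states)
    idx = np.abs(np.asarray(velocity)[:, None] - grid[None, :]).argmin(axis=1)
    up = np.zeros((n_states, n_states))
    down = np.zeros((n_states, n_states))
    for i, j in zip(idx[:-1], idx[1:]):
        # Assumption: non-decreasing steps count as acceleration, the rest as braking.
        (up if j >= i else down)[i, j] += 1

    def normalize(c):
        s = c.sum(axis=1, keepdims=True)
        return np.where(s > 0, c / np.maximum(s, 1), np.eye(n_states))

    return grid, normalize(up), normalize(down)

def predict_signed(v_now, grid, T_up, T_down, signs):
    """Forecast one velocity per predicted acceleration sign (+1 or -1)."""
    i = int(np.abs(grid - v_now).argmin())
    out = []
    for s in signs:
        row = T_up[i] if s >= 0 else T_down[i]
        v_next = float(row @ grid)            # expectation within the selected matrix
        i = int(np.abs(grid - v_next).argmin())
        out.append(v_next)
    return out
```

Passing a constant sign list corresponds to the Classified variant, while passing a sign sequence produced by a separate predictor corresponds to the Signed variant.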

2) ACCELERATION SIGN PREDICTION MODEL
There are two salient characteristics in an acceleration sequence: it is a typical stationary time series, and it is affected by additional external inputs. Thus, the nonlinear autoregressive neural network with exogenous inputs (NARX-NN) model seems to be one of the best choices to predict the sign of acceleration. NARX is a kind of dynamic recurrent neural network (RNN) that is good at discrete time series prediction [32]. It employs not only the past values of the target series but also the current and past values of an externally determined series that influences the series of interest. Here 70% of the driving cycle data is used for training, and 15% of the data is used for validation during training and for testing after training, respectively. Mathematically, the output value at state $k+1$ can be represented as in (3), where $y(k)$ denotes the previous values of the target itself and $X(k)$ is the associated exogenous series; $d_i$ and $d_o$ are the input and output lags, respectively. As shown in Fig. 6, with proper activation functions $f_*(\cdot)$ and biases $b_*$, the output can be expressed as (4). To determine the weight factors, two efficient training methods are available: the Levenberg-Marquardt (LM) algorithm and Bayesian regularization (BR) [33]. The training performance can be assessed by the error autocorrelation, which detects non-randomness in a data set and describes how the prediction errors are related in time. The basic idea is to calculate the correlation coefficient of a time series with a time-shifted copy of itself, as shown in (5); the value is higher when the two periods resonate with each other.
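For orientation, the generic NARX structure found in standard references is shown below. This is the common textbook form written with the symbols defined above; it is an assumption that the paper's own equation follows the same convention for the lags $d_o$ and $d_i$.

```latex
y(k+1) = f\big(\, y(k),\, y(k-1),\, \ldots,\, y(k-d_o),\;
               X(k),\, X(k-1),\, \ldots,\, X(k-d_i) \,\big)
```

Here $f(\cdot)$ stands for the nonlinear mapping realized by the network, $y$ would be the acceleration sign series, and $X$ the exogenous series.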
For a perfectly trained prediction model, the autocorrelation function should have only one nonzero value, which occurs at zero lag and equals the mean square error. The results are illustrated in Fig. 7. We can see that the correlations of BR, except for the one at zero lag, fall approximately within the 95% confidence limits around zero. Indeed, BR performed better than LM in terms of both prediction accuracy and insensitivity and was found to be far more robust and efficient. After selecting a suitable training method and further adjusting the parameters, a reasonable neural network prediction model can be established.
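The error autocorrelation check can be sketched as below. The sketch assumes a normalized autocorrelation (value 1 at zero lag, whereas the unnormalized version equals the mean square error at zero lag) and the usual large-sample 95% bound of $\pm 1.96/\sqrt{N}$; both conventions and the function names are choices made for this example.

```python
import numpy as np

def error_autocorrelation(errors, max_lag):
    """Normalized autocorrelation of prediction errors for lags 0..max_lag."""
    e = np.asarray(errors, dtype=float)
    e = e - e.mean()
    denom = np.dot(e, e)
    return np.array([np.dot(e[:len(e) - lag], e[lag:]) / denom
                     for lag in range(max_lag + 1)])

def confidence_bound(n_samples):
    """Approximate 95% limit for the autocorrelation of white noise."""
    return 1.96 / np.sqrt(n_samples)
```

For a well-trained model the errors behave like white noise, so nearly all values at nonzero lags should lie inside `confidence_bound(len(errors))`.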

3) IMPROVED PREDICTION AND COMPARATIVE STUDY
The prediction results of the improved MCMC methods are shown in Fig. 8, from which we can see that the predicted trajectories coincide well with the real velocity values. It is worth noting that the predicted velocity changes in the same direction as the real velocity. As a result, fewer unreasonable power demands that contradict the actual trend arise in the future horizons.
The significant improvement is easy to understand. Most basic time series prediction models assume that the data form a strongly stationary sequence. However, the data in Fig. 3, especially the sign of acceleration, show noticeable seasonality: the data undergo regular and predictable changes that recur at specific intervals. The piecewise pretreatment process captures this feature and thereby improves the results.
The detailed comparison of the prediction methods introduced above is shown in Fig. 9. In the basic MC method, the predicted value can move in the opposite direction, which might harm subsequent decisions. The results of the MCMC method are forced toward the average when many simulation samples are used. The improved MCMC methods overcome these shortcomings and show satisfactory performance without obvious fluctuations. Moreover, at the inflection points, the SMCMC adheres well to the real velocity trajectory thanks to the prediction of the acceleration sign. [...] monotonic deviation can be limited. The average prediction error is reduced by more than 50% by the proposed piecewise methods owing to the elimination of mutual interference. Also, the RMSE of the SMCMC forecast is only about 1 km/h, which is acceptable in practice.

III. ENERGY MANAGEMENT BASED ON SMPC METHOD
A. DMCP CONFIGURATION DESCRIPTION
The structure of the aforementioned DMCP system with a 2-speed coupler is illustrated in Fig. 10, where an auxiliary motor (AM) and a traction motor (TM) are arranged at either end of the coupler, which is a planetary gear train (PGT) box. With an electrified bidirectional actuator, the PGT can work in two states: the ring gear is either fixed to the gearbox housing or locked together with the carrier. Both motors are of the permanent magnet synchronous type and can operate in either motor or generator mode. Since the two propulsion units are arranged coaxially in series, they can deliver traction torque to the driving axle at the same time. That is, the total torque demand of the vehicle can be met by the cooperation of the two motors and the status of the PGT, by which the comprehensive power performance can be further improved. The target vehicle in this study is a lightweight 12-meter bus, and the main parameters are listed in Table 2. The critical problem in the DMCP configuration is to determine the output power of the two motors and the working status of the PGT. To describe this problem, the longitudinal dynamics of the bus is considered, and the mathematical model can be expressed as follows.
where $T_{wh,req}$ is the total required torque at the wheels, $T_{TM}$ and $T_{AM}$ are the torques of the TM and AM respectively, $\eta_d$ and $\eta_g^{gear}$ are the overall efficiencies of the driveline and the PGT respectively, $i_g$ and $i_a$ are the gear ratios of the PGT and the drive axle respectively, $k_p$ is the characteristic factor of the PGT, $T_{ebrk,max}$ is the torque capacity threshold for regenerative braking, and $T_{brk}$ is the total mechanical braking torque on the wheels.
where $r$ is the dynamic radius of the wheel, $m$ is the bus mass, $u$ is the bus velocity, $\delta_{gear}$ is the rotating mass coefficient in each PGT status, $g$ is the gravitational acceleration, $\alpha$ is the road slope, $f_r$ is the rolling resistance factor, and $A_w$ and $C_D$ are the frontal area and air drag coefficient respectively. According to the operating principle, the rotation speeds of the TM and AM are both related to the wheel speed, which corresponds to the bus velocity. Focusing on the energy management problem, an efficiency model of the TM and AM can be established with lookup tables, from which the electrical consumption is determined by the current rotation speed and output torque [34]. In addition, a simple but effective equivalent open-circuit voltage model with internal resistance is adopted to describe the battery energy performance [35].
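The two modelling ingredients named above can be sketched with textbook relations. The sketch assumes SI units (velocity in m/s), the standard road-load equation, and the standard internal-resistance battery model; every numeric default is a placeholder chosen for illustration and is not taken from Table 2.

```python
import math

def wheel_torque_demand(u, du_dt, m, r, delta=1.05, f_r=0.008,
                        c_d=0.6, a_w=7.5, alpha=0.0, rho=1.2, g=9.81):
    """Required wheel torque from the standard longitudinal road-load balance."""
    f_inertia = delta * m * du_dt               # acceleration incl. rotating mass
    f_roll = m * g * f_r * math.cos(alpha)      # rolling resistance
    f_grade = m * g * math.sin(alpha)           # slope resistance
    f_aero = 0.5 * rho * c_d * a_w * u ** 2     # aerodynamic drag
    return r * (f_inertia + f_roll + f_grade + f_aero)

def battery_current(p_bat, v_oc, r_int):
    """Current of an open-circuit-voltage plus internal-resistance battery model.

    Solves p_bat = v_oc * i - r_int * i**2 for the physically meaningful root.
    """
    disc = v_oc ** 2 - 4.0 * r_int * p_bat
    if disc < 0:
        raise ValueError("requested power exceeds what this battery model can deliver")
    return (v_oc - math.sqrt(disc)) / (2.0 * r_int)
```

Given the current, the state of charge can then be propagated by integrating the current over the battery capacity, which is the usual way such a model enters the SOC state of an energy management problem.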

B. OPTIMIZATION PROBLEM FORMULATION
The optimization goal of the DMCP energy management strategy is to find the optimal power split between the two motors and the proper operating status of the PGT coupler. Meanwhile, the real-time driving torque request and the physical constraints must be satisfied. As introduced before, SMPC is suitable for such online instantaneous optimization problems. It searches for the optimal control sequence in a finite horizon derived from the prediction model and then repeats the rolling optimization and the update of the state variables at each step. The necessary online solving procedures are as follows:
1) Predict the acceleration sign with the trained NARX neural network within a proper finite horizon $h_p$;
2) Select the corresponding velocity state transition probability matrix according to the current acceleration sign, then predict the velocities by SMCMC within the same horizon $h_p$;
3) According to the prediction results and the current state of the bus (e.g., state of charge, current gear), apply a suitable optimization algorithm such as DP to calculate the best control sequence over the finite horizon subject to the constraints;
4) Apply the first value of the predicted control sequence as the current control command to the vehicle, then update the history data in the prediction models and repeat procedures (1) to (4).
The general structure of the strategy is summarized in Fig. 11. As in all optimization problems, we need to formulate the target cost function, the relevant state variables, the control variables and the necessary constraints. The discrete model can be described as in (8), where $x_k$ is the state variable including the state of charge (SOC), the output torques of the two motors and the current gear status of the PGT, $x_k = [SOC_k, T_{TM}, T_{AM}, gear_k]$; $u_k$ is the control variable including the power split ratio (PSR) and the shift command, $u_k = [PSR_k, shift_k]$; and $w_k$ is the stochastic torque demand with disturbance, $w_k = T_{req}$.
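The four online procedures form a standard receding-horizon loop, which can be sketched as follows. The four callables are hypothetical placeholders for the sign predictor, the velocity predictor, the horizon solver, and the vehicle (or its simulation model); only the loop structure is the point of the example.

```python
def smpc_loop(n_steps, horizon, predict_sign, predict_velocity,
              solve_horizon, plant_step, state):
    """Receding-horizon loop: predict, optimize, apply the first command, repeat."""
    history, applied = [], []
    for _ in range(n_steps):
        signs = predict_sign(history, horizon)             # procedure 1
        v_pred = predict_velocity(state, signs, horizon)   # procedure 2
        u_seq = solve_horizon(state, v_pred)               # procedure 3
        state, measured = plant_step(state, u_seq[0])      # procedure 4: first command only
        history.append(measured)                           # update the prediction history
        applied.append(u_seq[0])
    return state, applied
```

Only `u_seq[0]` is ever applied; the rest of each optimized sequence is discarded and recomputed at the next step, which is what lets the feedback absorb occasional prediction errors.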
Therefore, the cost function over a certain period can be formulated as in (9), which considers the power consumption of the battery $P_{bat}$, the amplitude variation of the motor torque $T_*$, and the shifting frequency of the gear, with adjustable penalty factors $\phi_*$.
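One plausible shape for such a cost, written only from the terms named in the text, is sketched below. The exact form, the use of absolute differences, the sampling interval $\Delta t$, and the penalty subscripts $\phi_T$ and $\phi_g$ are assumptions for illustration.

```latex
J = \sum_{j=k}^{k+h_p-1} \Big[ P_{bat}(x_j, u_j, w_j)\,\Delta t
    + \phi_T \big( |\Delta T_{TM,j}| + |\Delta T_{AM,j}| \big)
    + \phi_g \, |\Delta gear_j| \Big],
\qquad x_{j+1} = f(x_j, u_j, w_j)
```

The first term accumulates battery energy over the horizon, the second discourages abrupt torque changes, and the third discourages frequent shifting; raising $\phi_T$ or $\phi_g$ trades energy efficiency for smoother operation.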

C. ROLLING OPTIMIZATION SOLVING BY DP
Dynamic programming can be used to solve the online rolling optimization problem since the entire predicted velocity profile is available in the finite horizon $h_p$. After substituting the discretized model (8) into the cost function, the target to be minimized in each step is obtained as (10). According to the DP algorithm, (10) can be solved by decomposing it into a sequence of recursive subproblems, optimizing them backward, and then applying the result forward to obtain the final optimal solution [25]. That is, at step $t_k = k + m$ ($0 \le m < h_p$), the subproblem can be written as
$$J_k^*(k+m) = \min_{u(k+m)} \big[ L\big(x(k+m), u(k+m), w(k+m)\big) + J_k^*(k+m+1) \big] \tag{11}$$
where $L(\cdot)$ denotes the per-step cost. Additionally, to obtain control commands that are applicable in practice, some reasonable constraints must be satisfied in the solving process, as given in (12), where $\omega_{*,k}$ and $T_{*,k}$ are the current rotation speed and output torque of the TM or AM respectively, $shift_k$ is the shifting command of the PGT, $\tau_{mot}$ is the upper limit of the torque response rate and $l_g$ is the threshold of the time interval between consecutive shifts.
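The backward-then-forward procedure can be illustrated on a deliberately small problem. In this sketch the state is only the gear (0 or 1), the control is the target gear, and the stage cost combines a made-up speed-dependent loss with a shift penalty; the predicted velocities, loss model, and penalty value are all hypothetical.

```python
def solve_horizon_dp(stage_cost, states, controls, transition, x0, horizon):
    """Finite-horizon DP: backward cost-to-go recursion, then a forward rollout."""
    cost_to_go = [{x: 0.0 for x in states} for _ in range(horizon + 1)]
    policy = [{} for _ in range(horizon)]
    for m in reversed(range(horizon)):                 # backward pass
        for x in states:
            best_u, best_c = None, float("inf")
            for u in controls:
                c = stage_cost(m, x, u) + cost_to_go[m + 1][transition(x, u)]
                if c < best_c:
                    best_u, best_c = u, c
            cost_to_go[m][x] = best_c
            policy[m][x] = best_u
    x, u_seq = x0, []
    for m in range(horizon):                           # forward pass
        u_seq.append(policy[m][x])
        x = transition(x, policy[m][x])
    return u_seq, cost_to_go[0][x0]

# Toy data: gear 0 is efficient near 10 km/h, gear 1 near 30 km/h.
v_pred = [10.0, 30.0, 30.0]
best_speed = {0: 10.0, 1: 30.0}
SHIFT_PENALTY = 2.0

def toy_stage_cost(m, gear, target_gear):
    loss = 0.1 * abs(v_pred[m] - best_speed[target_gear])
    return loss + SHIFT_PENALTY * (target_gear != gear)

u_seq, total = solve_horizon_dp(toy_stage_cost, states=(0, 1), controls=(0, 1),
                                transition=lambda x, u: u, x0=0, horizon=len(v_pred))
```

In this toy case the solver keeps gear 0 for the first step and shifts once for the remaining two. Hard constraints such as a torque rate limit or a minimum interval between shifts could be handled by returning an infinite stage cost for infeasible combinations.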

IV. PRACTICAL APPLICATION WITH DRIVING CYCLE RECOGNITION
A. ANALYSIS OF VARIOUS DRIVING CYCLES FOR INTERCITY BUS
The method above assumes that the target bus always runs over a deterministic route, i.e., it is feasible only for a city bus without additional conditions. However, to meet the needs of intercity buses and avoid the adverse effects of uncertainty from unknown cycles, driving cycle recognition (DCR) should be conducted in advance. As in general recognition problems, the critical factor of the DCR process is to find typical features that distinguish the cycles from each other. Many researchers are inclined to use a preset maximum velocity as the indicator based on experience [36]. However, such an instantaneous single value is not reliable from an engineering perspective, and it is hard to design appropriate nonlinear criteria from it.
To determine the typical features reasonably, we analyzed the typical driving cycles of transit buses as shown in Fig. 12, where the dotted lines are used to show the most remarkable peak value of each area. Since the driving behavior of buses is quite steady in most of the situations (e.g., periodic stops in cities, changeless velocity in highways), it is rational and appropriate to recognize the traffic scenarios. Also, the MCMC process is only sensitive to the change of velocities. Hence, we can classify the driving cycles into city, suburbs, and highway groups according to the distribution of velocity and acceleration respectively. Here the velocity distribution is explicit and the general peak points, as the green lines illustrated in Fig. 12, constitute a series of thresholds to distinguish the cycles. Moreover, the acceleration distribution also presents a noticeable difference between the highway and the other driving cycles. Therefore, these features can capture the driving cycle patterns accurately and can provide quantitative criteria for the DCR problem.

B. DRIVING CYCLE RECOGNITION BY FUZZY LOGIC
As analyzed above, the task is to determine the driving cycle scenarios by proper classification parameters. Hence, the fuzzy logic control (FLC) could be an effective method to establish a link between these two items. The basic idea is to convert the input to fuzzy variables so that they can be used to distinguish the various patterns by preset fuzzy rules. The procedures can be summarized as four main steps.
First, the crisp variables, which take precise input values as opposed to fuzzy memberships between 0 and 1, are converted into fuzzy values, and each of them is assigned a linguistic label by the membership function. This procedure requires considerable human expertise, which is one of its drawbacks, as the accuracy depends greatly on the fuzzification knowledge. However, according to the analysis results in Fig. 12, the fuzzification boundaries can be determined easily, as shown in Fig. 13, whose key parameters and trends are obtained from the peak points, intersection points, and distribution features, respectively. Second, formulate the fuzzy rule database by assigning a relationship between the fuzzy inputs and the desired outputs. The rule database is displayed in Table 3, where [Low, Middle, High] and [Flat, Gentle, Sharp] are the fuzzy descriptions of the velocity and acceleration features, respectively. The numbers are additional weights that can be tuned manually to refine the outputs.
Third, locate the fuzzy outputs and merge them by applying fuzzy approximate reasoning. Here the reasoning process is based on the Mamdani fuzzy theory since the desired output is a definite numerical result. From the rules established in the second step, a corresponding fuzzy relation matrix can be obtained, as shown in Fig. 14. Finally, the defuzzification process is carried out to form the desired crisp outputs. That is, the fuzzy results are converted to the familiar driving cycle candidates according to the maximum membership principle.
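The four steps can be sketched with a small min-max inference routine. All membership boundaries and the rule base below are hypothetical placeholders (they do not reproduce Fig. 13 or Table 3, and the rule weights are omitted); the sketch only shows fuzzification, rule evaluation with min as AND and max as OR, and selection by maximum membership.

```python
def trap(x, a, b, c, d):
    """Trapezoidal membership: rises on [a, b], equals 1 on [b, c], falls on [c, d]."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)

# Hypothetical boundaries: mean velocity in km/h, mean |acceleration| in m/s^2.
VELOCITY_SETS = {"Low": (-1, 0, 15, 30), "Middle": (15, 30, 45, 60), "High": (45, 60, 200, 201)}
ACCEL_SETS = {"Flat": (-1, 0, 0.1, 0.3), "Gentle": (0.1, 0.3, 0.5, 0.7), "Sharp": (0.5, 0.7, 5, 6)}

# Hypothetical rule base: (velocity label, acceleration label) -> driving cycle.
RULES = {
    ("Low", "Flat"): "city", ("Low", "Gentle"): "city", ("Low", "Sharp"): "city",
    ("Middle", "Flat"): "suburb", ("Middle", "Gentle"): "suburb", ("Middle", "Sharp"): "suburb",
    ("High", "Flat"): "highway", ("High", "Gentle"): "highway", ("High", "Sharp"): "suburb",
}

def recognize_cycle(mean_v, mean_abs_a):
    """Return the cycle with the largest aggregated membership, plus all scores."""
    mu_v = {k: trap(mean_v, *p) for k, p in VELOCITY_SETS.items()}
    mu_a = {k: trap(mean_abs_a, *p) for k, p in ACCEL_SETS.items()}
    scores = {"city": 0.0, "suburb": 0.0, "highway": 0.0}
    for (lv, la), cycle in RULES.items():
        strength = min(mu_v[lv], mu_a[la])             # fuzzy AND of the two inputs
        scores[cycle] = max(scores[cycle], strength)   # fuzzy OR across rules
    return max(scores, key=scores.get), scores
```

A call such as `recognize_cycle(80.0, 0.05)` returns the winning label together with the membership of each candidate, so borderline windows can be inspected instead of being silently assigned.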
Besides a rational FLC model, the online monitor is another crucial factor that influences the effect in practical application, yet it is often ignored and set intuitively. Generally, when the bus is running, it is easy for the vehicle control unit (VCU) to monitor and record the speed and then calculate the mean velocity and acceleration over each discrete window. A time-based monitor was adopted in previous research, with a macro time scale set empirically to 100 seconds [36]. However, such a formulation is unreasonable when the possible driving cycles differ greatly from each other. It is noteworthy that the energy consumption, which is the goal of the optimization, is related much more closely to distance than to time. Besides, the stochastic stop time, especially in city cycles, can badly confuse a time-based monitor. Therefore, an improved distance-based monitor is proposed to overcome these disadvantages.
As shown in Fig. 15, two typical driving cycles with different features are taken as an example to show the effect of the monitors. The time-based monitor with a 100-second scale can be disturbed by the stops and is unnecessarily large for the highway situation, resulting in avoidable energy loss. In other words, it is hard to determine a single time scale that suits all cycles. After transforming the variables from the time domain to the distance domain, all the characteristics are retained and the stochastic influence of stop time is eliminated. Moreover, in the distance domain, it is easy to set a proper distance scale that balances accuracy and energy loss. To reduce the online information storage and calculation burden, a 500-meter distance scale is adopted here, meaning that the FLC is executed once every 500 meters of forward travel.
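A distance-based monitor can be sketched as follows. The function name, the choice of features (distance divided by moving time, and mean absolute acceleration), and the 1-second sampling are assumptions for this illustration; only the 500-meter window comes from the text.

```python
def distance_windows(velocity_mps, dt=1.0, window_m=500.0):
    """Emit (mean moving velocity, mean |acceleration|) once per distance window."""
    features = []
    dist, moving_time, acc_sum, n = 0.0, 0.0, 0.0, 0
    prev = velocity_mps[0]
    for v in velocity_mps:
        dist += v * dt
        if v > 0:
            moving_time += dt            # stop time adds no distance and is not counted
        acc_sum += abs(v - prev) / dt
        n += 1
        prev = v
        if dist >= window_m:             # close the window after 500 m of travel
            features.append((dist / moving_time, acc_sum / n))
            dist, moving_time, acc_sum, n = 0.0, 0.0, 0.0, 0
    return features
```

Because a window closes only after the required distance has been covered, a long stop delays the next recognition instead of diluting its mean velocity, which is the behaviour the distance domain is meant to provide.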

V. SIMULATION RESULT AND EVALUATION
A. PREDICTION RESULTS OF PRACTICAL DRIVING CYCLES
As mentioned above, the operating principle of the online adaptive SMPC-based energy management strategy consists of four main parts. First, the original data are collected and analyzed from a statistical viewpoint to form the basis of the prediction and recognition models. Second, the scenario of the current driving cycle is identified by the FLC-based DCR model. Third, according to the current driving cycle, the corresponding trained neural networks for acceleration sign prediction and the state transition probability matrices of SMCMC are chosen to forecast the future velocity change over a finite horizon. Finally, DP is used as the solver to obtain the optimal control sequence under the constraints, and the process is repeated at each step. To validate the effect of the proposed method, a combination of various driving cycles is established, as shown in Fig. 16, assuming that the transit bus runs between the city and suburbs.
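The online part of this procedure (recognition, prediction, and rolling optimization) can be outlined as a receding-horizon loop. This is only a structural sketch: `run_smpc` and its three callables `recognize`, `predict`, and `solve` are hypothetical names standing in for the FLC-based DCR model, the SMCMC predictor, and the DP solver, and the initial cycle guess and rectangular distance integration are assumptions.

```python
def run_smpc(measured_speeds, recognize, predict, solve,
             horizon=5, window_m=500.0, dt=1.0):
    """Receding-horizon skeleton: recognize -> predict -> optimize -> apply.

    recognize(segment)        -> driving cycle label for the last window
    predict(cycle, v, horizon) -> forecast velocities over the horizon
    solve(v, forecast)         -> control sequence over the horizon
    """
    cycle = "city"          # assumed initial guess before the first window
    travelled, start = 0.0, 0
    applied = []
    for k, v in enumerate(measured_speeds):
        travelled += v * dt
        if travelled >= window_m:            # distance-based monitor fires
            cycle = recognize(measured_speeds[start:k + 1])
            travelled -= window_m
            start = k + 1
        forecast = predict(cycle, v, horizon)  # cycle-specific predictor
        controls = solve(v, forecast)          # finite-horizon optimization
        applied.append(controls[0])            # apply only the first control
    return applied

# Toy stand-ins, only to show the call pattern.
demo = run_smpc(
    [20.0] * 60,
    recognize=lambda seg: "highway" if sum(seg) / len(seg) > 15 else "city",
    predict=lambda cycle, v, h: [v] * h,
    solve=lambda v, forecast: forecast,
)
print(len(demo))
```

Only the first element of each optimized sequence is applied before the horizon rolls forward, which is what allows later measurements to correct earlier prediction errors.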
The driving cycle recognition results are illustrated in Fig. 17 in the time domain. The recognition is accurate for the city and highway scenarios, where the driving features are captured correctly by the FLC model. Notably, part of the suburb portion is misclassified into the city group. Closer inspection shows that this occurs when the velocity within the monitor range is low and contains stops. Considering the probability-based state transition process in MC, such deviations are acceptable and have no adverse impact on the subsequent velocity prediction, since the probability distributions are similar in these situations.
It should be noted that the constraints of the optimization mainly consist of the shift frequency and the output torque, whose response time is usually less than one second. Therefore, the prediction horizon h_p is set to 5 seconds, which is adequate and avoids cumulative errors. The final prediction results can be found in Fig. 16. With the help of driving cycle recognition, the predicted velocities converge around the real velocities in each horizon, as in the single-cycle prediction results. In detail, some prominent deviations appear where the velocity fluctuates noticeably, e.g., at around 200 seconds of the HWFET highway cycle in Fig. 16. The main reason is that at these points the acceleration changes frequently around zero with low amplitude, which makes it hard for the neural networks to determine the sign of the acceleration. However, such occasional errors can be compensated by the rolling optimization and feedback effect of SMPC.
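To make the role of the acceleration sign concrete, the sketch below rolls a Markov chain forward over a 5-step horizon while restricting each sampled acceleration to the predicted sign. It is an illustration under stated assumptions, not the SMCMC formulation of this paper: the acceleration grid `ACC_STATES`, the uniform transition matrix, the masking-and-renormalizing step, and the function names are all hypothetical, and the sign sequence is supplied directly rather than produced by a trained neural network.

```python
import random

ACC_STATES = [-1.0, -0.5, 0.0, 0.5, 1.0]  # m/s^2, hypothetical discretization

def sign(a):
    return (a > 0) - (a < 0)

def signed_rollout(v0, a_idx, P, signs, rng, dt=1.0):
    """Sample one velocity trajectory over len(signs) steps.

    P[i][j] is the transition probability between acceleration states;
    signs[k] in {-1, 0, +1} is the predicted acceleration sign at step k.
    """
    v, idx, traj = v0, a_idx, []
    for s in signs:
        # Keep only transitions whose acceleration agrees with the sign.
        weights = [p if sign(a) == s else 0.0
                   for p, a in zip(P[idx], ACC_STATES)]
        if sum(weights) == 0.0:       # nothing admissible: use the raw row
            weights = list(P[idx])
        idx = rng.choices(range(len(ACC_STATES)), weights=weights)[0]
        v = max(0.0, v + ACC_STATES[idx] * dt)  # speed cannot go negative
        traj.append(v)
    return traj

def predict_mean(v0, a_idx, P, signs, n_samples=200, seed=0):
    """Monte Carlo estimate: average many sampled rollouts per step."""
    rng = random.Random(seed)
    runs = [signed_rollout(v0, a_idx, P, signs, rng) for _ in range(n_samples)]
    return [sum(step) / n_samples for step in zip(*runs)]

P_uniform = [[0.2] * 5 for _ in range(5)]   # placeholder transition matrix
print(predict_mean(10.0, 2, P_uniform, [1, 1, 1, 1, 1]))
```

When the true acceleration hovers around zero, a wrong sign pushes every sample to the wrong side of the current velocity, which is consistent with the deviations noted above for weakly fluctuating segments.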

B. CONTROL STRATEGIES VALIDATION AND COMPARISON STUDY
To validate the energy-saving performance, an offline DP-based strategy and a preliminary online rule-based strategy are adopted for comparison. Fig. 18 shows the SOC trajectories of the three strategies. The DP-based result ranks first, since it searches for the globally optimal solution under the assumption that all the information is known in advance. The result of the proposed SMPC strategy closely approaches the DP benchmark and is clearly better than that of the preliminary strategy.
To show the practical implementation ability, the detailed output torque of the two motors and the gear status of the PGT are illustrated in Fig. 19. The gear shifting frequency is constrained appropriately, with a minimum interval of 10 seconds. In the city condition, the gear is kept low to handle the low-speed and high-torque demands of frequent start-up situations. In the highway scenario, the gear is set to high to allow flexible torque allocation between the two motors. Moreover, there is no extreme output torque; namely, the two motors can cooperate harmoniously to propel the bus, which is the expected benefit of DMCP.
Table 4 gives a quantitative comparison of the results from different strategies. Considering online application performance, the improvement of the SMPC strategy is noteworthy: approximately 2% of SOC is saved over the total round trip. That means the SMPC-based controller can extend the mileage of the transit bus by about one-third of the whole trip on a full charge. The intuitive reason for the improvement can be found in Fig. 20. The rolling optimization in SMPC successfully concentrates the working points in the high-efficiency area to achieve better energy consumption performance. Namely, the proposed strategy can take full advantage of the potential of the DMCP system in the practical driving cycles considered.
Fig. 21 illustrates the results from a statistical perspective, which gives a more objective evaluation of the different strategies. To eliminate the interference from the original motor efficiency, the maximum motor efficiency, the mean motor efficiency, and the mean efficiency of the working points are listed together to display the utilization proportion. The preliminary strategy is poor at exploiting the potential of both motors, whereas the DP strategy reaches a higher efficiency than the average level. The SMPC aligns the working points of the TM with the high-efficiency area as far as possible, at the cost of partial utilization of the AM. Considering the gear shifting constraints shown in Fig. 19, this occurs because the gear status is fixed in city cycles, which restricts the allocation of the AM. In summary, the proposed SMPC-based strategy achieves excellent energy consumption results while remaining implementable online for various unknown driving cycles.
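One common way to enforce a minimum shift interval inside a DP solver is to augment the state with the time elapsed since the last shift. The sketch below shows this idea on a toy problem; it is an assumption about how such a constraint can be encoded, not the solver used in this paper. The function name `dp_gear_schedule`, the per-step cost table, and the two-gear setup are hypothetical, and torque allocation is omitted.

```python
def dp_gear_schedule(cost, g0, dwell0, min_dwell=10):
    """Backward DP over gears with a minimum interval between shifts.

    cost[k][g]: stage cost of running gear g at step k (hypothetical values).
    State: (gear, steps since last shift, capped at min_dwell).
    Returns (optimal total cost, gear chosen at each step).
    """
    n, n_gears = len(cost), len(cost[0])
    dwells = range(min_dwell + 1)
    J = {(g, d): 0.0 for g in range(n_gears) for d in dwells}  # terminal cost
    policy = []
    for k in reversed(range(n)):
        Jk, pk = {}, {}
        for g in range(n_gears):
            for d in dwells:
                # Option 1: hold the current gear.
                best = cost[k][g] + J[(g, min(d + 1, min_dwell))]
                choice = g
                # Option 2: shift, allowed only after min_dwell steps.
                if d >= min_dwell:
                    for g2 in range(n_gears):
                        c = cost[k][g2] + J[(g2, 1)]
                        if g2 != g and c < best:
                            best, choice = c, g2
                Jk[(g, d)], pk[(g, d)] = best, choice
        J = Jk
        policy.insert(0, pk)
    # Forward pass: follow the stored policy from the initial state.
    g, d, gears = g0, min(dwell0, min_dwell), []
    for k in range(n):
        g2 = policy[k][(g, d)]
        d = 1 if g2 != g else min(d + 1, min_dwell)
        g = g2
        gears.append(g)
    return J[(g0, min(dwell0, min_dwell))], gears

# Toy costs that favor a different gear at every step.
toy = [[0.0, 1.0] if k % 2 == 0 else [1.0, 0.0] for k in range(30)]
total, gears = dp_gear_schedule(toy, g0=0, dwell0=10)
print(total, gears)
```

With the dwell counter in the state, frequent shifting that would be optimal step by step is excluded by construction, at the price of a state space that grows with the length of the minimum interval.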

VI. CONCLUSION
In this paper, an adaptive energy management strategy based on SMPC is proposed for a dual-motor battery electric bus. Various MC-based prediction models for the velocity forecasting problem are established and compared. The main findings and contributions can be summarized as follows:
(1) After analyzing the statistical distribution features of typical velocity profiles, a novel signed MCMC method with an acceleration sign prediction model based on a NARX neural network is proposed. The predicted results match the real velocity well, and the average RMSE is improved by 59.82% compared with the conventional MC method.
(2) A DP-based rolling optimization method under the SMPC framework is devised and implemented successfully for the DMCP system, considering the shifting frequency and motor torque dynamic response limits.
(3) Taking different driving scenarios into account, an FLC-based DCR model is built to identify the current driving pattern. A systematic design process for the core factors based on statistical analysis and a distance-based monitor are proposed.
The simulation results show that the SMPC strategy can handle the practical cycle combination correctly and can improve the energy performance by approximately 6% with an acceptable dynamic response compared to the preliminary rule-based strategy. Namely, the improved strategy can increase the electric-only range by 13.46 km on a full battery charge.
While the route profile of a transit bus is generally known ahead of time, calibration and experimental validation will still be required. Future work will focus on updating the state transition probabilities and the rules in FLC online adaptively to enhance the robust performance under various operating conditions.