Excitation Signal Design for Generating Optimal Training Data for Complex Dynamic Systems

The appropriate choice of excitation signal in system identification is an important but rarely considered part of the process, and it determines the success of many downstream activities. This paper presents a novel methodology for excitation signal design to create high-accuracy multivariable nonlinear dynamic neuro-fuzzy models. Two different approaches to experimental design are investigated. In the first, a prescribed transient manoeuvre is used. In the second, informative potential is used to deconstruct the transient into a sequence of inputs designed to cover the same input space and reduce model development time. Star discrepancy is used to evaluate the resulting designs and is shown to provide a good proxy for excitation design quality. Results are presented showing the prediction accuracy of the model for an application example, achieving a minimum cumulative error of < 2% over a two-minute transient. It is shown that the neuro-fuzzy models identified using data from the two different approaches have similar accuracy. However, the second approach, based on informative potential, leads to a more generalised model and reduces the development time by a factor of four. This is a significant result that shows the importance of choosing an appropriate excitation signal.


I. INTRODUCTION
NONLINEAR dynamic system models that capture time-dependent behaviour are often required for developing controls and optimising complex systems. A single model capable of capturing all the nonlinearities, particularly under dynamic operating conditions, will invariably have a complex structure and high order [1]. The types of model typically used include lookup table, physics-based, Mean Value Model (MVM), Neural Network (NN), Artificial Neural Network (ANN) or hybrid models. The successful control and optimisation of these systems depend on the suitability and the efficacy of the models used [2].
In many real-world processes, the physics are too complex or poorly understood to model using first principles and system identification techniques provide a convenient solution [3], [4]. The main drawback of data-driven models is that they are highly dependent on the information contained in the training data [1]. Excitation Signal Design (ESD) is a critical but under-investigated step in the development of these models. If the training data is not sufficiently rich in information, the model will extrapolate and perform poorly. Furthermore, if models are required in a control system, they must have the correct sensitivity to changes in control system parameters. Developing identified models of complex systems which have these characteristics is a non-trivial task.
Local Model Networks (LMN), also known as Takagi-Sugeno (T-S) Neuro-Fuzzy (NF) dynamic models [5], [6] have become increasingly popular for nonlinear system identification. They are a good solution for many applications that require fast running models e.g. control [7], optimisation [2] and virtual sensors [1], [8] and can learn high-dimensional nonlinear associations between inputs and outputs [1]. T-S NF models divide the input space into multiple smaller fuzzily defined regions, each associated with a local model, thus forming a collection of loosely coupled models [7].
This work presents a novel methodology for designing the identification tests and excitation signals to create high-accuracy multivariable nonlinear NF models of complex dynamic systems. Two ESD approaches are discussed and evaluated in this work. The first approach was developed to give good model accuracy over a specified transient. This approach applies different excitation signals to the inputs as the transient is repeated to give coverage of the full input range at each point. The second approach was developed to reduce the time required to gather training data and produce a more general model. This approach uses informative potential to deconstruct the prescribed transient into a sequence of inputs designed to cover the same input space. The main contributions of this work are:
• The development of two different approaches to ESD for identification of nonlinear NF dynamic models.
• The use of informative potential to generate an excitation signal that produces a more general model at lower experimental cost.
• The use of star discrepancy to evaluate the quality of, and optimise, an excitation signal.
The process used in this work for creating NF models is presented in Figure 1 and begins with an outline of the requirements for the NF model, since these inform all subsequent stages. The requirements (fidelity, inputs, outputs, execution rate, etc.) are typically derived from downstream processes.
The real system chosen to exemplify the ESD process in this work is a turbocharged direct-injection gasoline engine with Variable Valve Timing (VVT). This is a complex, highly nonlinear, dynamic system. The models are to be used in a downstream process in which the aim is the minimisation (through mathematical optimisation) of transient fuel consumption and NOx emissions. This requires dynamic models of NOx and fuel consumption with sensitivity to VVT system Intake Valve Opening (IVO) and Exhaust Valve Closing (EVC) angles, fuel system Fuel Rail Pressure (FRP), ignition system Spark Angle (SA), and engine speed and torque. Figure 2 illustrates the nonlinear behaviour of fuel consumption and NOx emissions as IVO and EVC in the VVT sub-system are changed. This shows the behaviour at a single speed and torque engine operating point. This behaviour changes with operating point, thus the system is highly complex. Modelling the time-dependent behaviour of this system is a significant challenge.
The NF models are to be executed with a 50 ms time step and the model accuracy requirements are:
• Integrated total fuel consumption < 2% error
• Integrated total engine-out NOx emission < 4% error
This paper is organised as follows. Section II outlines the requirements of, and two approaches towards, excitation signal design. Section III describes the experimental setup. Section IV details the selection of the NF model structure and the model training process. Section V presents the model validation results and compares the performance of the models trained using the two different excitation approaches. Section VI presents the conclusions.

II. EXCITATION SIGNAL DESIGN
ESD is a critical step in the development of all types of data-driven models since the quality of the training data determines the accuracy of the models independently of the model structure [1], [9], [10]. The data used for training needs to contain enough information about the system [10], [11] to minimise model extrapolation [1], [10].
The first step in ESD is to establish the characteristics of the input signals, for example whether dynamic excitation is required or not [10]. The next step is to understand the range of the input signals. The system should be excited through all the input values that occur in real operation [10] to avoid model extrapolation. For linear systems a Pseudo Random Binary Signal (PRBS) suffices [1], [9], [10] and can be mechanistically generated [12]. However, for a nonlinear system the signal needs to excite all the frequencies and nonlinearities of the system and be persistently exciting in amplitude, since the process gain and dynamics depend on the operating point [10]. Examples of nonlinear system perturbation signals reported in the literature include the Multilevel Pseudo Random Signal (MPRS) [13], [14] and the Amplitude-modulated PRBS (APRBS) [1], [9], [15]. To determine the bandwidth of the system and the dominant settling time, staircase excitation tests can be employed [1], [17]. In general, the more training data that can be gathered the better the model, but in practice the time available for experiments is limited.
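To illustrate why an APRBS excites amplitude as well as frequency, the following is a minimal sketch of one way such a signal could be generated. The function name, the hold-length range and the amplitude limits are hypothetical choices for illustration and are not the signals used in this work.

```python
import random

def aprbs(n_samples, min_hold, max_hold, lo, hi, seed=0):
    """Amplitude-modulated PRBS: a piecewise-constant signal in which both
    the hold length and the level of every step are drawn at random."""
    rng = random.Random(seed)
    signal = []
    while len(signal) < n_samples:
        hold = rng.randint(min_hold, max_hold)  # samples to hold this level
        level = rng.uniform(lo, hi)             # amplitude of this step
        signal.extend([level] * hold)
    return signal[:n_samples]

# Hypothetical example: 200 samples with levels between 0.8 and 1.0
u = aprbs(n_samples=200, min_hold=5, max_hold=20, lo=0.8, hi=1.0)
```

A plain PRBS would switch between only two levels; drawing the level from a continuous range is what lets the signal visit the intermediate operating points of a nonlinear system.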
Good coverage of the input space is required to ensure that extrapolation in the use of the models is minimised. Typically in a model identification exercise the number of experiments that can be undertaken and hence the length of any excitation signal is constrained by the time available or access to test hardware. Optimal use of the available experimental budget and resources is required. To achieve this and as a measure of the relative quality of excitation signals, the authors propose the use of Star Discrepancy, a method not used for this purpose previously.
Star discrepancy is commonly used to quantify how uniformly distributed a point set is over a region [18], [19]. In this work it was computed in 4D, i.e. over IVO, EVC, FRP and time, and provides a relative measure of the uniformity of the design. For a 4D hypervolume it is computed: where: N is the total number of points within a closed hypervolume
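For reference, the textbook definition of star discrepancy for N points in the unit hypercube is the largest absolute difference, over all boxes [0, x) anchored at the origin, between the fraction of points inside the box and the volume of the box. The sketch below evaluates that textbook definition by brute force; it may differ in notation and implementation from the computation used in this work, and the function name is an assumption. Inputs are assumed to be normalised to the range 0 to 1 in every dimension.

```python
import itertools
import math

def star_discrepancy(points):
    """Exact star discrepancy of points in the unit hypercube.
    Brute force over the grid spanned by the point coordinates, so it is
    only practical for small point sets."""
    pts = [tuple(p) for p in points]
    n, d = len(pts), len(pts[0])
    # Candidate box corners: every coordinate seen in each dimension, plus 1.0
    axes = [sorted({p[k] for p in pts} | {1.0}) for k in range(d)]
    worst = 0.0
    for corner in itertools.product(*axes):
        volume = math.prod(corner)
        # Points strictly inside the box and points inside or on its faces
        n_open = sum(all(p[k] < corner[k] for k in range(d)) for p in pts)
        n_closed = sum(all(p[k] <= corner[k] for k in range(d)) for p in pts)
        worst = max(worst, volume - n_open / n, n_closed / n - volume)
    return worst

even = star_discrepancy([(0.25,), (0.75,)])      # evenly spread 1D points
clustered = star_discrepancy([(0.1,), (0.2,)])   # points bunched near zero
```

The clustered set gives the larger value, which matches the interpretation used in the text: a smaller discrepancy indicates a more uniform design.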

A. APPROACH 1 -A PRESCRIBED TRANSIENT
It is often the case that there is interest in identifying the behaviour of a system over a prescribed transient, for example for optimisation. This has the effect of constraining the input space since the transient is defined in terms of one or more of the time-constrained inputs, u* = f(t). There are two options in this case for ESD. The first is to excite u according to some design, leaving u* to vary according to f(t); the second is to excite all the inputs, u and u*. The advantage of the first approach is that it is simple and intuitive, with the disadvantage that for a longer Transient Manoeuvre (TM) the time taken to gather the model training data could become large. The advantage of the second approach is potentially a shorter experiment, with the disadvantage that significant care has to be taken to ensure a good design.
In the first approach to ESD, a subset of the inputs is constrained to a defined trajectory. For the application example in this work the inputs of interest are IVO, EVC, FRP, SA, engine speed and torque. Two of these, speed and torque, were prescribed to follow an identical transient that was linearly scaled to represent different gear selections (gear 0, gear +1 and gear -1), as shown in Figure 3. The IVO and EVC inputs to the system are coupled, so their excitation was based on the operating region defined by the actuation limits, as visualised in Figure 4, in which the region was divided into a total of 16 discrete points chosen at regular intervals to evenly cover the space. The excitation signal was then formed by transitioning between these 16 discrete points. FRP was treated differently as it is not coupled and has well-defined minimum and maximum limits. The FRP excitation was designed by applying a multiplier from 0.8 to 1.0 to the baseline FRP 2D engine calibration map. This 20% effective range of the multiplier was first determined via a sensitivity study. The effective range was sub-divided into four discrete values so that at each of the 16 discrete (IVO, EVC) points (Figure 4) there were four FRP settings. In total, there were 64 (16 × 4) discrete actuator combinations for IVO, EVC and FRP.
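The count of 64 actuator combinations is a full factorial of the 16 (IVO, EVC) points and the four FRP settings. A minimal sketch of that enumeration follows; the point labels are hypothetical placeholders because the actual angle pairs are not listed in the text, and the even spacing of the four FRP multipliers is an assumption.

```python
import itertools

# Hypothetical labels for the 16 (IVO, EVC) points of Figure 4;
# the real angle pairs are not given in the text.
vvt_points = [f"vvt_{i:02d}" for i in range(16)]

# Four FRP multipliers across the 0.8 to 1.0 range (even spacing is assumed).
frp_multipliers = [0.8 + 0.2 * k / 3 for k in range(4)]

# Full factorial: every (IVO, EVC) point paired with every FRP multiplier.
combinations = list(itertools.product(vvt_points, frp_multipliers))
```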
The T10-90 time for the VVT system was measured to be < 1 s using a step response test. This was used to segment the transient into one-second sections, for a total of 120 segments. In each one-second section, there were 64 possible discrete combinations of IVO, EVC and FRP which could be applied. Therefore, a total of 64 different perturbation signals were created, each applying one of these combinations in each interval. This was done in such a way that over all 64 repeats of the TM, every one of the 64 combinations was visited in each one-second section. An example of one of the 64 perturbation sequences is shown in Figure 5. The IVO and EVC are shown as a 2D region extending in time, which illustrates how the excitation signal is a 4D design, in contrast to simpler statically designed experiments. To assess the relative quality of the design the star discrepancy was calculated. Figure 6 shows the star discrepancy for all 64 perturbation sequences for ESD-1. The discrepancy is on a scale of 0 to 1, where a smaller number indicates a more uniform distribution; across the 64 sequences it ranged from a minimum of 0.1017 to a maximum of 0.2410. This result shows that the process of sub-division plus randomisation used to generate the 64 perturbation vectors is effective at producing a low-discrepancy design, indicating good design coverage. In total, ESD-1 comprised 192 (3 × 64) repeats of the 120 s TM, since there were three gear selections. All 192 ESD experiments were completed sequentially and automatically by the test system for consistency, with a 15 s period of stabilisation between each experiment. The total time required to complete all the experiments was approximately 7.5 hours.
VOLUME 4, 2016
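The requirement that every combination is visited in each one-second section across the 64 repeats can be met by giving each section its own random permutation of the combinations. The sketch below shows that idea only; the exact randomisation used in this work is not described, so the function and its parameters are assumptions.

```python
import random

def build_schedule(n_combos=64, n_segments=120, seed=0):
    """schedule[r][s] is the index of the actuator combination applied in
    segment s of repeat r. Each segment receives its own random permutation
    of the combinations across the repeats."""
    rng = random.Random(seed)
    schedule = [[None] * n_segments for _ in range(n_combos)]
    for s in range(n_segments):
        order = list(range(n_combos))
        rng.shuffle(order)
        for r, combo in enumerate(order):
            schedule[r][s] = combo
    return schedule

schedule = build_schedule()
```

Each row of `schedule` is one perturbation sequence for a single repeat of the TM, and each column contains every combination exactly once.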

B. APPROACH 2 -A DESIGNED TRANSIENT
A second approach was developed with the purpose of reducing the time required for collecting the model training data and also achieving more uniform design coverage for a more generalised model. In this second approach, an excitation signal was designed by varying all inputs, i.e. u and u*. A scatter plot of the measured speed and torque of three scaled versions of the TM (gear 0, gear +1 and gear -1) is shown in Figure 7. The process used to generate the excitation signal has two steps. Informative potential was used to identify 85 discrete speed and torque points from the original transient. These were sub-divided into two groups, denoted by the blue and pink circles, and used to create two excitation designs with FRP multipliers of 0.8 and 1.0 applied respectively to each group of points. Informative potential originated from the mountain method first presented in [20]. The informative potential algorithm used in this paper is from Chiu [21]:
$$P_i = \sum_{j=1}^{N} \exp\left(-\frac{4}{r_a^2}\,\lVert x_i - x_j \rVert^2\right)$$
where x_i and x_j are the data points, N is the number of data points and r_a is a positive constant. When r_a is smaller, the points closer in distance to the current point will contribute more to the potential value, hence a greater resolution in the ability of additional points to inform is provided. To illustrate, Figure 8 shows the difference in the potential surface using two different r_a values on the same data set.
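As a rough sketch of how a potential of this kind can be used to pick informative points, the code below computes the potential of every point and then selects points greedily, reducing the potential around each selected point in the manner of Chiu's subtractive clustering. The subtraction radius r_b = 1.5 r_a, the exclusion of already selected points and the function names are assumptions; the text does not state how the 85 points were extracted.

```python
import math

def potentials(points, r_a):
    """Potential of each point: sum of Gaussian-weighted distances to all points."""
    alpha = 4.0 / r_a ** 2
    return [
        sum(math.exp(-alpha * math.dist(xi, xj) ** 2) for xj in points)
        for xi in points
    ]

def select_informative(points, n_select, r_a, r_b=None):
    """Greedy selection of the n_select highest-potential points (n_select
    must not exceed the number of points)."""
    r_b = 1.5 * r_a if r_b is None else r_b   # assumed subtraction radius
    beta = 4.0 / r_b ** 2
    pot = potentials(points, r_a)
    chosen = []
    for _ in range(n_select):
        k = max(range(len(points)), key=pot.__getitem__)
        chosen.append(k)
        p_k = pot[k]
        # Reduce the potential of points near the one just selected
        pot = [p - p_k * math.exp(-beta * math.dist(x, points[k]) ** 2)
               for p, x in zip(pot, points)]
        pot[k] = -math.inf   # assumption: never pick the same point twice
    return chosen

# Hypothetical normalised (speed, torque) points: a small cluster and an outlier
pts = [(0.0, 0.0), (0.05, 0.0), (0.0, 0.05), (1.0, 1.0)]
picked = select_informative(pts, n_select=2, r_a=0.2)
```

In this toy data set the first pick is the centre of the cluster and the second is the isolated point, because the potential of the remaining cluster members is strongly reduced once their neighbour has been selected.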
[Figure 8 panels: (a) r_a = 0.2; (b) r_a = 0.1]
The excitation design was created from the most informative points using a random linear transition between points whilst constraining the ramp rate to the maximum possible for the system, identified using prior tests. The resulting excitation sequence is shown in Figure 9 and the final time-based excitation sequence is shown in Figure 10. The engine speed and torque were held for two seconds at each discrete point to provide steady-state information for the model. Figure 11 compares the ESD-1 and ESD-2 speed and torque demands. The advantage of ESD-2 is that it more uniformly covers the input space and the resulting model is much more general, with good prediction capabilities beyond the constraint of the prescribed TM. In theory, any cycle which lies within the boundaries of the selected TM can be modelled with the NF models developed from ESD-2, thus making it a much more general approach. To provide a means of quantifying the uniformity of the two different ESD approaches, the star discrepancy was determined for the normalised engine speed and torque demands. Figure 12 shows the 2D intermediate star discrepancy functions S(x) (actual distribution) and S_N(x) (perfect distribution) for ESD-1 (Figure 12a) and ESD-2 (Figure 12b). Comparison of Figure 12a and Figure 12b shows a greater difference between S(x) and S_N(x) for ESD-1, i.e. ESD-1 has higher discrepancy. The star discrepancy for ESD-1 in Figure 12a is 0.3867 and for ESD-2 in Figure 12b is 0.1874. This lower value for ESD-2 confirms it is more uniformly distributed.
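The construction of the time-based sequence (random ordering of the selected points, rate-limited linear ramps, and a two-second hold at each point) can be sketched as follows. The single normalised rate limit shared by speed and torque, the 50 ms sample period and the example targets are assumptions made for illustration.

```python
import math
import random

def ramp_and_hold(points, max_rate, hold_s=2.0, dt=0.05):
    """Build a sampled (speed, torque) trace that holds each target for
    hold_s seconds and ramps linearly between targets without either
    channel changing faster than max_rate per second."""
    hold_n = round(hold_s / dt)
    trace = [points[0]] * hold_n
    for (s0, t0), (s1, t1) in zip(points, points[1:]):
        # The channel with the larger change sets the ramp duration
        ramp_t = max(abs(s1 - s0), abs(t1 - t0)) / max_rate
        ramp_n = max(1, math.ceil(ramp_t / dt))
        for i in range(1, ramp_n):
            f = i / ramp_n
            trace.append((s0 + f * (s1 - s0), t0 + f * (t1 - t0)))
        trace.extend([(s1, t1)] * hold_n)
    return trace

# Hypothetical normalised (speed, torque) targets visited in random order
targets = [(0.2, 0.1), (0.8, 0.5), (0.4, 0.9), (0.6, 0.3)]
random.Random(0).shuffle(targets)
trace = ramp_and_hold(targets, max_rate=0.5)
```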
The ESD-2 excitation for IVO and EVC of the VVT system uses a similar approach to the ESD-1 design. The IVO and EVC operating range was divided into 13 discrete points, with a single point at the centre and 12 points distributed evenly around the boundary; this is analogous to a Box-Behnken experimental design. The fewer discrete IVO and EVC points (13 as opposed to 16) and the different design were conceived for more efficient excitation, giving the same coverage whilst enabling a shorter excitation test. In total, ESD-2 consisted of four experiments of roughly 25 min duration each and took just under 2 hours to complete. Compared to the approximately 7.5 hours required for ESD-1, this was roughly a four-fold reduction in test time.

III. EXPERIMENTAL SETUP
The gasoline engine (more details available in Appendix A) used in this study is rated at 120 kW and is connected to a 220 kW AVL Schneider Electric transient dynamometer. The fuel consumption was measured with a Sentronics FlowSonic LF ultrasonic fuel flow sensor [23], which is capable of a measurement rate of up to 2.2 kHz. Exhaust NOx was measured with an ECM 5240 NOx analyser [24], which has a response time of < 1 s. Figure 13 presents a schematic of the measurement system. Prior to testing, an effort was made to ensure sufficient levels of repeatability were achieved. For 10 repeats of the TM, the coefficient of variation (COV) of total fuel consumption was < 0.4%, and the COV for total exhaust out NOx emissions was < 1%. All the required signals were recorded at 100 Hz, which is five times the subsequent model execution rate (50 ms time step, i.e. 20 Hz), to allow filtering and sub-sampling to help improve the model training result.
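As a simple illustration of the sub-sampling step from 100 Hz to the 50 ms model rate, the sketch below averages each block of five samples. Block averaging is only a stand-in chosen for brevity; it is not the filtering used in this work, and the function name is an assumption.

```python
def block_average(samples, factor=5):
    """Average each run of `factor` samples, e.g. 100 Hz to 20 Hz for factor=5.
    Trailing samples that do not fill a complete block are dropped."""
    n_blocks = len(samples) // factor
    return [
        sum(samples[i * factor:(i + 1) * factor]) / factor
        for i in range(n_blocks)
    ]

# One second of hypothetical 100 Hz data becomes 20 samples at 50 ms spacing
one_second = [float(i) for i in range(100)]
resampled = block_average(one_second)
```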

IV. NEURO-FUZZY MODEL STRUCTURE AND TRAINING
Two separate NF models were identified, one for fuel consumption and the other for engine-out NOx. Engine speed, engine torque, the VVT angles IVO and EVC, and FRP were chosen as model inputs, as were SA (spark angle) and lambda (actual air-fuel ratio / stoichiometric air-fuel ratio). Figure 14 shows the model structure for both the fuel and NOx models. Model input orders were selected together with the number of local models to satisfy accuracy requirements. The premise variables were selected from the model input set. To avoid unnecessary model complexity, only one delayed time step of each input was used as a premise variable. A total of seven premise variables were used for both the NF fuel and NOx models. Before model training, a Savitzky-Golay filter was used to remove high-frequency noise from the fuel consumption data. The NF model structure used was a discrete nonlinear dynamic model of the form: where, y(k) is the predicted value at the current sampling point. The terms u_Ne, u_Te, u_ivo, u_evc, u_frp, u_λ, u_spk represent the regressors from the seven inputs. A Gaussian function was used as the membership function. The local model is a linear AutoRegressive with eXogenous terms (ARX) model augmented with a constant term. Thus, Equation 3 can be rewritten: where, M is the number of local models; w_i is a row vector of local ARX model parameters; the consequent input vector, x(k) = [1 u_Ne u_Te u_ivo u_evc u_frp u_λ u_spk]^T, is a column vector of regressors formed from a constant and the delayed inputs. The premise input vector, z(k) = [z_1(k), z_2(k), ..., z_nz(k)]^T, is formed using one delayed value of each of the seven model inputs, as shown in Figure 14.
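In the usual local model network form, the output is a weighted sum of the local linear models, with weights given by normalised Gaussian membership values of the premise vector. The sketch below evaluates that generic form for one time step. The centre and width parameters (`c`, `sigma`), the dictionary layout and the function name are assumptions for illustration and are not taken from the models identified in this work.

```python
import math

def lmn_predict(x, z, local_models):
    """Evaluate a local model network for one sample.
    x: consequent regressor vector with a leading 1 for the constant term
    z: premise vector
    local_models: list of dicts with keys 'w' (local linear parameters),
                  'c' (Gaussian centres) and 'sigma' (Gaussian widths)"""
    # Unnormalised Gaussian membership of z in each local region
    mu = [
        math.exp(-0.5 * sum(((zj - cj) / sj) ** 2
                            for zj, cj, sj in zip(z, m["c"], m["sigma"])))
        for m in local_models
    ]
    total = sum(mu)  # assumed non-zero, i.e. z lies near at least one centre
    # Validity-weighted sum of the local linear model outputs
    return sum(
        (mu_i / total) * sum(w * xi for w, xi in zip(m["w"], x))
        for mu_i, m in zip(mu, local_models)
    )

# Two hypothetical local models with a single premise variable
models = [
    {"w": [1.0, 0.0], "c": [0.0], "sigma": [0.2]},
    {"w": [3.0, 0.0], "c": [1.0], "sigma": [0.2]},
]
y_mid = lmn_predict(x=[1.0, 0.5], z=[0.5], local_models=models)
y_low = lmn_predict(x=[1.0, 0.0], z=[0.0], local_models=models)
```

Midway between the two centres both local models contribute equally, while close to one centre the output is dominated by that region's local model, which is the loosely coupled behaviour described for T-S NF models.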
The LOcal LInear MOdel Tree (LOLIMOT) algorithm was used for the training of the NF models [22].

V. MODEL VALIDATION

A. ESD-1 NF MODEL VS. VALIDATION DATA
For the model validation tests, a sequence of 76 transients was run with the gear 0 scaling. The calibration map used to determine the VVT system behaviour was swept through a range of offset values to generate the validation data. Figure 15 shows the performance of the ESD-1 gear 0 fuel model for seconds 50 to 75 of two different validation tests (Tests 3 and 43). The most significant acceleration event occurs with torque reaching roughly 70% of the peak. Comparison of Figure 15a with Figure 15b shows how the VVT calibration map has affected the fuel consumption between 62 and 67 seconds and how the NF fuel model successfully predicts this behaviour. The same comparison is shown for the NF NOx model in Figure 16. This compares the ESD-1 gear 0 NOx model to the measured NOx emissions and shows that the model captures the dynamic behaviour well. The oscillation in Figure 16 for both the measured and predicted NOx emissions is caused by rich/lean perturbations designed into the system to improve the performance of the exhaust catalyst. The effect of this on fuel consumption is relatively small and is therefore less evident in Figure 15. The change in the VVT calibration map between Tests 3 and 43 results in a slight reduction in NOx, which the NF model predicts. Interestingly, at around 57 seconds the peak is reduced significantly and the model predicts this change in dynamic behaviour well. Generally, it can be seen that the NOx model captures the dynamic behaviour, though perhaps not as well as the fuel model. Part of the explanation is the effect on NOx of the rich/lean (i.e. lambda) perturbation of the Engine Control Unit (ECU) strategy. From Figure 16 there is evidence that the NOx model extrapolates at the extremes. This is potentially due to the range of the training data available, as lambda was not included in the ESD.
The addition to ESD-1 of a purposely designed lambda perturbation signal with a greater amplitude and variable frequency range would likely improve the model accuracy; however, this was not investigated in this work.
To examine the instantaneous output model accuracy for the validation data, the error between the model output and the measured value was calculated for each data point in the 50 ms re-sampled measured data. The units of calculated model error in the RMSE calculation were g/min for fuel and ppm for NOx. This created a vector of model error for each transient, which was then used to calculate the Root Mean Square Error (RMSE) for the NF fuel and NOx models for each gear profile. This result is presented in Figure 17 for the ESD-1 fuel and NOx models. From Figure 17a it can be seen that the RMSE of the gear 0 NF fuel model is lower relative to gear +1 and gear -1. The RMSE fluctuates from test to test and this is expected. For example, the coefficient of variation of measured total fuel consumption can vary by up to 0.4% due to small fluctuations in uncontrolled experimental parameters and also small changes in the engine control system from test to test. The RMSE of the NOx models (Figure 17b) shows a trend-wise change over the 76 tests; the NOx models do not perform as well as the fuel models and the gear 0 NOx model does not perform better than the other two models. Figure 18a shows that all three ESD-1 models under-predict the fuel consumption by between 0 and 2%, with a difference between the models of roughly 1%. The fuel models are most accurate for Tests 29-48. The total NOx is over-predicted in most cases, with the greatest error of nearly 7% for a VVT offset of +1; otherwise the NOx model error is typically below 5% and within the required 4% accuracy.
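The two accuracy measures used here, the RMSE of the instantaneous error and the percentage error of the integrated total, can be sketched as follows. The function names and the sign convention (negative meaning under-prediction) are assumptions; the 50 ms sample period is taken from the text.

```python
import math

def rmse(predicted, measured):
    """Root mean square of the sample-by-sample model error."""
    squared = [(p - m) ** 2 for p, m in zip(predicted, measured)]
    return math.sqrt(sum(squared) / len(squared))

def integrated_percent_error(predicted, measured, dt=0.05):
    """Percentage error of the integrated (total) model output relative to
    the integrated measurement. Negative values mean under-prediction.
    The sample period dt cancels in the ratio and is kept only to make the
    integration explicit."""
    total_pred = sum(predicted) * dt
    total_meas = sum(measured) * dt
    return 100.0 * (total_pred - total_meas) / total_meas

# Hypothetical three-sample traces, for illustration only
err_rmse = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
err_total = integrated_percent_error([1.0, 1.0], [2.0, 2.0])
```

A model can have a noticeable RMSE and still a small integrated error if its instantaneous errors cancel over the transient, which is why both measures are reported.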
[Figure 18 panels: (a) % error, fuel; (b) % error, NOx]

B. ESD-2 NF MODEL VS. VALIDATION DATA
Figure 19 compares the ESD-2 NF fuel model to the validation data for validation tests 3 and 43. Comparing Figure 15 with Figure 19, it is observed that the ESD-1 gear 0 NF fuel model and the ESD-2 model show very similar performance for these two validation tests. The equivalent result for the ESD-2 NF NOx model is presented in Figure 20 and can be compared to the ESD-1 model result in Figure 16. The ESD-2 NF NOx model performs slightly worse at predicting low NOx between 50 and 60 seconds compared to the ESD-1 gear 0 model. For conditions with higher NOx concentration, the two models are very similar. The RMSE of the instantaneous error for the ESD-2 fuel and NOx models is shown in Figure 21. The ESD-2 fuel model RMSE is similar to the ESD-1 fuel model RMSE (Figure 17a), especially for the gear 0 ESD-1 model. For the ESD-2 NF NOx model, the RMSE is slightly worse when compared to the ESD-1 model performance. Interestingly, there is good trend-wise agreement in RMSE with validation test between the ESD-1 and ESD-2 models. Figure 22 shows the ESD-2 model percentage error for the integrated fuel and NOx NF model outputs vs. the validation data. The trend in the ESD-2 NF fuel model percentage error with test number is very similar to the trend observed for the ESD-1 models in Figure 18a; however, the ESD-2 model overestimates fuel consumption, which is opposite to the ESD-1 model behaviour. The percentage error is overall slightly greater than for the ESD-1 models but is within the required 2% for more than half the validation data. For NOx, the ESD-2 NF model percentage error has a similar trend to the ESD-1 models; however, the overall prediction accuracy for total NOx is improved for the ESD-2 model and falls within the 4% requirement for all of the validation tests.
Finally, the total fuel and total NOx for the validation data vs. the integrated model output are shown in Figure 23. Here the 2% error band above and below the measured validation result is shown. This shows that both NF fuel models are nearly always within 2% error, with the ESD-1 model performing best. For total NOx the ESD-2 model performs better, but both models could be improved. Importantly, although there are some specific differences, the performance of the ESD-1 and ESD-2 derived NF models is comparable overall.
Since the validation data is based on gear 0 scaled versions of the TM, this indicates that the ESD-2 design approach, when implemented well, can result in models with equivalent performance to models developed using the ESD-1 approach.

C. ESD-1 AND ESD-2 VS. ALTERNATIVE 2 MINUTE TM
To evaluate the generalisability of the ESD-1 and ESD-2 NF models, the models were compared to measurements for an alternative 2-minute TM shown in Figure 24. Figure 25a and Figure 25b show the model output vs. measurement for the fuel and NOx models respectively. The performance of the ESD-1 and ESD-2 fuel models is very similar, with both models capturing the overall behaviour well but less well at the limits of operation. The accuracy of the NOx models differs more: the ESD-1 NOx model predicts three short-period peaks in NOx emissions where none are measured, whereas the ESD-2 model is more representative of the measured NOx. These peaks are coincident with negative torque and are therefore outside of the training data input space for both ESD-1 and ESD-2. Otherwise the two NOx models are broadly similar, and in several sections both models predict greater NOx emissions than were measured. Finally, Figure 26a and Figure 26b show the cumulative integral of fuel and NOx emissions for this alternative TM. Figure 26a shows similar performance for both fuel models, with the ESD-2 based model slightly more accurate. The performance of the ESD-1 and ESD-2 NOx models was poor relative to the fuel models, with both models predicting higher NOx than measured. The ESD-2 NOx model was slightly better, largely due to the inaccurate short-period NOx peaks predicted by the ESD-1 model at negative torque. In summary, the ESD-2 derived models performed better over this alternative TM and were more robust in situations outside of the original ESD boundary.

VI. CONCLUSION
This work presents a novel methodology for designing the excitation signals for collecting dynamic training data to create accurate NF models of complex multivariable nonlinear systems operating transiently. Two alternative ESD approaches are presented for capturing the dynamics of the system. The first uses multiple repeats of the transient with time-based excitation signals. The second uses informative potential to deconstruct the original transient to cover the operating region of the original transient with a substantially reduced test time requirement.
Star discrepancy is used to optimise the ESD in the first approach and is also used to quantify relative design quality in terms of coverage. Evaluation of the NF models over an alternative transient showed the second ESD models performed better and were more generalisable.
Of the two ESD approaches, the second, using informative potential, leads to improved NOx model accuracy and only slightly decreased fuel model accuracy. A key benefit of the second approach is the reduced time required to collect the model training data, from 7.5 hours to 2 hours, producing a more time-efficient design.
Finally, the results show that, using the described ESD approaches, the engine total fuel consumption for the 2-minute transient can be modelled to an accuracy of < 2%. The engine-out NOx emissions are predicted to an accuracy of < 4%. The NOx model accuracy could likely be improved through the inclusion of air-fuel ratio (i.e. lambda) in the ESD, as NOx is highly sensitive to lambda; however, this was not investigated in this work. Analysis of the RMSE of the NF models compared to the validation data shows that the NF fuel model performs well at predicting the impact of large artificially generated shifts in the control system parameters.

APPENDIX A ENGINE DETAILS
The details of the experimental engine used in this study are listed in Table 1.