Methods for Virtual Validation of Automotive Powertrain Systems in Terms of Vehicle Drivability—A Systematic Literature Review

For the last two decades, an extensive transition in automotive X-in-the-loop activities from isolated electronic control units to real-time related, geographically distributed validation tasks has occurred. Benefits are strengthening frontloading, enabling concurrent engineering and reducing prototypes and testing efforts. As a downside, comprehensive system understanding and adequate simulation models must be provided. New technological trends like software over-the-air updates imply a continuous validation process even after the start of production. The present review focuses on the virtual validation of vehicle longitudinal dynamics. This exemplary field of application receives more and more attention as electrification of the vehicle powertrain accelerates, and this property directly influences the vehicle DNA. A systematic review process based on the PRISMA workflow has been conducted, focusing on drivability-related powertrain applications. The investigation reveals the following trends: First, increasing complexity of virtualisation methods and models for validation activities influenced by vehicle-to-everything and geographically distributed development. Second, missing standards for virtual validation and proof of representativeness for combined real-virtual testing. In addition, many studies only contemplate the advantages of hardware-in-the-loop-driven development, disregarding crucial limitations and risks of such approaches. In conclusion, the question is no longer whether to validate virtually but how to realise virtual validation comprehensibly.


I. INTRODUCTION
The automotive industry has been subject to significant changes like the deployment of control units, advanced driver assistance systems or electrification of the powertrain. From the point of view of the vehicle development process, one notable trend is the growth of software functions in the vehicle. An exponential increase in the number of internal combustion engine (ICE) control unit parameters from 85 in 1980 [124, p. 11], over 10,000 in 1990 [124, p. 11] up to 30,000 in 2017 [135] substantiates this tendency. The number of control units in a car amounts to more than 100 [42, p. 2] or 125 [113, p. 2]. Similar to smartphones, innovation in vehicles is often attributed to software functions. Recent studies claim a rising share of software in innovation from 20% up to 80% in recent years, citing forecasts of about 90% innovation potential by the end of this decade [28, p. 485], [42, p. 129]. Innovation demonstrates an advantage of a product and influences market shares and overall competition. The automotive future will be autonomous, connected and electric [158, p. 33].
The associate editor coordinating the review of this manuscript and approving it for publication was Hassan Omar.
There are varying appraisals of the costs of state-of-the-art vehicle development. Deicke [42, p. 3] finds that the extent of software doubles every two to three years and that software accounts for circa 50%-70% of the development costs of control units. The overall development expenses due to software design and testing are assumed to be between 50% [113, p. 6] and 70% [42, p. 129]. Those statements support the approaches in vehicle development to enhance the degree of agility. Today, the vehicle development process, generally illustrated by the V-model, cannot be changed immediately to a typical software development process like Scrum since there are too many dependencies between hardware and software functionalities and between car manufacturers and suppliers. In automotive software development for control units, there are already some established virtual validation platforms. In this regard, Deicke [42, p. 26] defines virtual validation as the integration and quality assurance of embedded code from series development in a generic system independent of the target hardware. The aim is to ensure code reusability and early identification of failures. Typical fields of application are the qualification of basic software, development-attendant testing, and hardware-independent integration [42, p. 31]. Software standards like AUTOSAR [13], [61] or COVESA (formerly GENIVI) [38] are a key premise for fulfilling these requirements.
VOLUME 11, 2023 This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
Even though virtualisation of the vehicle development process has significant benefits like the reduction of real prototypes and savings in time and costs, there are primary downsides. Virtualisation at test beds demands meeting real-time requirements and automation targets for optimal test bed utilisation. These requirements limit the ideal handling and deployment of the simulation model in terms of a trade-off between computation time and accuracy of results, availability of model components or reliability of closed-loop performance [44, p. 44]. Surrendering complex vehicle-level test beds like chassis dynamometers would significantly impede the development of innovations, knowledge build-up, and troubleshooting [44, p. 128].
A study from Kalra and Paddock [88] shows a statistical assessment of the required miles needed for autonomous vehicle testing based on a fleet of 100 autonomous vehicles driving 24/7 at an average speed of ca. 40 km/h (see Tab. 1). For a high-reliability target in such a validation task, the amount of testing is not realisable within automotive development without virtual validation [46]. Belbachir et al. [19] support this statement and estimate several million kilometres for advanced driving assistance systems (ADAS) validation.
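The order of magnitude behind such statements can be illustrated with a short calculation. The sketch below assumes the failure-free demonstration relation n = ln(1 − C) / ln(1 − F), with confidence level C and maximum tolerated failure rate F per mile; the target of 1.09 fatalities per 100 million miles, the 95% confidence level and all function names are illustrative assumptions and are not taken from Tab. 1.

```python
import math

MILES_PER_KM = 0.621371


def failure_free_miles(max_failure_rate_per_mile, confidence):
    """Miles driven without any failure that are needed to claim, at the
    given confidence, that the true failure rate does not exceed the limit."""
    return math.log(1.0 - confidence) / math.log(1.0 - max_failure_rate_per_mile)


def fleet_years(miles, fleet_size=100, speed_kmh=40.0):
    """Years a fleet needs to cover `miles` when driving 24/7, 365 days a year."""
    miles_per_year = fleet_size * speed_kmh * MILES_PER_KM * 24 * 365
    return miles / miles_per_year


# Assumed target: 1.09 fatalities per 100 million miles at 95 % confidence.
required_miles = failure_free_miles(1.09e-8, 0.95)
required_years = fleet_years(required_miles)
print(f"{required_miles / 1e6:.0f} million miles, about {required_years:.1f} years")
```

Under these assumptions, the fleet of 100 vehicles would have to drive several hundred million miles, which takes more than a decade of uninterrupted operation.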
One exemplary field of application for the virtualisation initiatives mentioned above is drivability. Drivability refers to a driver's subjective evaluation of the vehicle's response to his inputs. It stands in a conflict of objectives with other vehicle characteristics like driving dynamics, ride comfort, durability and efficiency. Liu et al. [112, p. 2] assign a pivotal role to drivability in determining the vehicle brand and DNA. Jauch et al. [81, p. 300] claim a special consideration of drivability in the context of hybridisation and full electrification of vehicle powertrains. Many operation strategies in case of more than one propulsion unit integrated into a vehicle give more degrees of freedom for vehicle design and construction. New challenges like (dis-) engagement of secondary propulsion units operating at a secondary vehicle axle without direct mechanical coupling by prop shaft are assumed to shape future drivability calibration tasks [81]. A car manufacturer defines a requirement specification for each model at the beginning of each development process. The requirement specification and related vehicle properties are crucial to fulfilling legislation and customer demands. Software features are becoming more important for cars in the future and are about to replace recent vehicle characteristics like the number of combustion engine cylinders. The authors expect vehicle drivability to help manufacturers get the vehicle model to stand out better against competitors, especially for electric vehicles.
Drivability is supposed to be most influential to the overall human driving experience of the car [140, p. 573]. According to Zehetner et al. [166, p. 2], feelings of confidence and safety are caused by a positive driving event and thus directly linked to drivability design. Although vehicle drivability is mainly perceivable at a frequency range of about 2-10 Hz [74, p. 6], trends like electrification of the powertrain cause an increase to a frequency range of up to 20 Hz [63, p. 11]. Some motorsports applications already show frequency ranges up to 50 Hz due to increased stiffness in powertrain components. Riel et al. [140, p. 575] collate the frequency range below 100 Hz of longitudinal vehicle dynamics as crucial for drivability and ride comfort.
The present paper is structured as follows: First, the methodical approach utilizing the PRISMA workflow is described. Then, the field of application for virtual validation, vehicle drivability, is explained in detail, focusing on state-of-the-art phenomena analysis, modelling and evaluation. The disadvantages of established drivability validation methods are pointed out. Moreover, we put drivability development into the context of the latest virtual validation methods, from single components (like batteries) to complex scenarios like vehicles driving in a virtual world. Here, we consider test cases where a device is under test with virtual components creating a semi-virtual closed-loop. X-in-the-loop gives a general phrase for such devices at different levels of complexity. We evaluate the benefits and disadvantages of hardware-in-the-loop (HIL) methods as a starting point for virtual validation in vehicle development. Finally, the current benefits, risks and challenges of virtual validation are summarized for future research work from different perspectives. The key research questions for the present systematic review are:
1) What is the current state-of-the-art in automotive drivability development?
2) Which trends and developments are recognisable in automotive X-in-the-loop methods?
3) What are the opportunities and threats regarding virtual validation of drivability-related vehicle properties?

II. METHODICAL APPROACH
A systematic approach based on the fundamental concepts of PRISMA workflow characterises the literature research process [130]. Scopus and Web of Science are used to gather the first literature research data set. In addition, other search tools like the Google Scholar search engine are utilised to append specific literature to the primary data set. In general, the search filter includes a restriction of literature between 1994 and 2022 to catch the most relevant period for powertrain-related validation methods. The first comprehensive dissertations published in the field of drivability support this claim [15], [22], [56], [111]. Nevertheless, specific literature was added later in exceptional cases to provide essential information to the presented research queries (such as [53]). The initial search, carried out on October 10th, 2021, on Scopus used the following search string:

TITLE-ABS-KEY ( powertrain OR driveline OR drivetrain AND vehicle OR automotive AND test OR in-the-loop OR validation ) AND PUBYEAR > 1994
All results of the PRISMA workflow's identification, screening and inclusion phases are presented in appendix A (Fig. 7-9 and Tab. 4). The literature data set obtained in the identification phase has been screened by adding keywords such as validity or objectification or by modifying the boolean operators of the search string. Afterwards, the records assessed for eligibility were screened manually, and some were excluded for documented reasons. Most records were excluded because the study does not belong to the automotive sector or is not concerned with validation or testing activities. The total number of studies finally included in this systematic review amounts to 169. A histogram showing the publication year distribution is presented in Fig. 1.
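The logic of the Scopus search string can be mirrored in a local pre-filter for exported records. The following sketch assumes that OR is evaluated before AND, which to our understanding corresponds to the operator precedence documented by Scopus, and it approximates the TITLE-ABS-KEY field search by plain substring matching. The record layout (title, abstract, keywords, year) and the function name are hypothetical.

```python
# Term groups of the search string: OR within a group, AND between groups.
TERM_GROUPS = [
    ("powertrain", "driveline", "drivetrain"),
    ("vehicle", "automotive"),
    ("test", "in-the-loop", "validation"),
]


def matches_query(record, min_year_exclusive=1994):
    """Return True if a record satisfies every term group and PUBYEAR > 1994."""
    if record["year"] <= min_year_exclusive:
        return False
    text = " ".join(
        (record["title"], record["abstract"], record["keywords"])
    ).lower()
    return all(any(term in text for term in group) for group in TERM_GROUPS)


# Hypothetical record used only to illustrate the filter.
example = {
    "title": "Powertrain test bed for drivability",
    "abstract": "A vehicle model is coupled to the rig.",
    "keywords": "hardware-in-the-loop",
    "year": 2015,
}
print(matches_query(example))
```

Such a filter only reproduces the boolean structure of the query; the manual eligibility screening described above cannot be replaced by it.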
Amongst others, two literature reviews on hardware-in-the-loop systems have been analysed as well [27], [58]. The first review by Brayanov and Stoynova [27] covers the development of hardware-in-the-loop methods in general. The focus is on HIL setup, architecture and classification. A definition of a HIL system is presented, and the review concludes with requirements and challenges for HIL systems. The second review by Fathy et al. [58] analyzes HIL systems in the automotive industry. Critical enablers for automotive HIL systems are discussed. The main focus here is on engine-in-the-loop applications used, for example, for emission measurements.
In contrast to those two reviews, the present paper discusses virtual validation in the context of vehicle drivability. Here, HIL systems are part of the virtual validation step, supported by objective evaluation of human perception and automation. Drivability is explained in detail to derive the requirements for a virtual validation method. Limitations of state-of-the-art HIL validation methods in the automotive sector are considered for discussion on a potential drivability adaptation. The present paper discusses the strengths and threats of virtual validation for automotive powertrain applications in particular. In summary, both literature reviews are used to build a fundamental HIL understanding but are extended by in-depth drivability knowledge and related validation methods.

III. DRIVABILITY: AN EXEMPLARY AREA OF APPLICATION FOR VIRTUAL VALIDATION
A driver has various options for controlling the vehicle using actuators like the steering wheel, throttle or brake pedal. Additional interfaces have been established in recent years, like drive mode selection affecting vehicle response behaviour. The interaction between driver and vehicle is considered a human-machine interface (HMI) when the driver perceives the vehicle's response to his inputs subjectively. That response can be classified, for instance, by the type of triggered perceptual channels (e.g. visual, auditory, haptic) or the dynamic behaviour expressed in a frequency range.
In this context, drivability refers to the subjective feeling of the vehicle's response to the driver's inputs focussing on vehicle longitudinal dynamics [53]. Concerning the increasing demand for control unit functions, so-called comfort functions are implemented to get an optimum trade-off between agile, spontaneous vehicle response, efficient powertrain usage, and convenient, steady driving. Examples of such comfort functions are Bonanza damping or anti-shuffle control [135, p. 8].
A. DRIVABILITY RELATED PHENOMENA
Drivability phenomena are noticeable as vibrations or noise and are mostly short, specific events lasting up to a few seconds. Table 2 shows essential phenomena and their frequently-related effective area in the noise, vibration and harshness (NVH) range.
During a vehicle load change, the first (powertrain) and third (transmission; typical frequency range: 51-55 Hz [69, p. 40]) torsional natural frequencies of the powertrain are mainly excited (see Tab. 3). As a consequence, an initial jerk (shunt), vehicle shuffle and oscillation of the transmission shafts are caused [18, p. 5]. Subsequently, backlash-affected components change their contact positions, ending in a metal-like sounding impact [148, pp. 13-14]. This impact phenomenon is referred to as clonk. Biermann and Hagerodt [22, p. 59] demonstrate a dependency of the clonk noise on the sum of angular momentum of the transmission components. Although some authors [56], [69], [148] assign clunk to the clonk phenomenon, others [20], [74] distinguish between the two because of slightly different mechanisms. Clunk is a combination of high-frequency impacts and low-frequency settling of an oscillating powertrain. Engaged higher gears and spontaneous load changes are key premises for the clonk phenomenon. In contrast, clunk is observed in a disengaged powertrain state and requires a manual transmission setup [20, pp. 13-14]. By disengaging the clutch, the inertia of the motor and flywheel are decoupled from the residual powertrain, and the first torsional natural frequency increases. Contrary to this, a longitudinal oscillation of the vehicle body forced by clutch engagement is referred to as judder. The third torsional mode of the powertrain contributes to the transmission rattle, a phenomenon of movement of backlash-affected components inside the transmission.
Shuffle is considered one of the most substantial phenomena in terms of drivability [56], [148] and is therefore the focus of the present paper. It describes the first torsional natural frequency of the powertrain, where the combination of the vehicle body and wheel-tyre subsystem oscillates against the drive side (motor, flywheel, transmission) [148]. Shuffle often occurs after Tip-In, Tip-Out and drive-away events and subsides after four to five cycles [56], [59], [69], [168]. An approximate value for the shuffle frequency $f_{sh}$ is stated in equation (1) [148, p. 10], where $c_{tot}$ refers to the overall torsional stiffness of the powertrain subsystem, $J_M$ and $J_V$ denote the inertia of motor and vehicle and $i_{tot}$ is the overall powertrain ratio. To reduce the parametrization effort, the term $i_{tot}^2 J_V$ can be neglected, and a fundamental estimation of the shuffle frequency is derived.
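The influence of stiffness, inertia and overall ratio on the shuffle frequency can be made tangible with a generic free-free two-inertia oscillator. The sketch below is a textbook simplification and not necessarily the exact form used in [148]: it assumes that the overall stiffness is referred to the wheel side and that the motor inertia is reflected to the wheel side with the squared overall ratio. All parameter values and function names are hypothetical.

```python
import math


def shuffle_frequency_hz(c_tot, j_motor, j_vehicle, i_tot):
    """First torsional natural frequency of a free-free two-inertia model.

    Assumption: c_tot (Nm/rad) is referred to the wheel side, so the motor
    inertia is reflected to the wheel side with i_tot**2.
    """
    j_motor_at_wheel = i_tot**2 * j_motor
    omega_squared = c_tot * (1.0 / j_motor_at_wheel + 1.0 / j_vehicle)
    return math.sqrt(omega_squared) / (2.0 * math.pi)


def shuffle_frequency_simplified_hz(c_tot, j_motor, i_tot):
    """Estimate for a vehicle inertia much larger than the reflected motor inertia."""
    return math.sqrt(c_tot / (i_tot**2 * j_motor)) / (2.0 * math.pi)


# Hypothetical values: 10 kNm/rad, 0.2 kg m^2 motor inertia, overall ratio 12,
# vehicle inertia from 1500 kg mass at 0.3 m wheel radius (m * r**2).
f_full = shuffle_frequency_hz(10_000.0, 0.2, 1500.0 * 0.3**2, 12.0)
f_simple = shuffle_frequency_simplified_hz(10_000.0, 0.2, 12.0)
print(f"full: {f_full:.2f} Hz, simplified: {f_simple:.2f} Hz")
```

In this form, a larger overall ratio increases the reflected motor inertia and lowers the estimated frequency, while a stiffer powertrain or a lighter drive unit raises it.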
It is imperative to correlate oscillation events to the correct root cause. Shuffle has to be differentiated from nearby natural modes of secondary components and subsystems by means of the dominant frequency range relevant for drivability issues.

B. SENSITIVITY ANALYSIS AND MODELLING APPROACHES
Many researchers have studied drivability phenomena such as shuffle and clonk. The following list comprises the main factors for shuffle found in investigations of conventional front-wheel driven [56], [69], [148] and rear-wheel driven [20] cars:
• Overall powertrain stiffness, with special focus on the clutch and side shafts due to their relatively low stiffness [56], [148]
• Motor inertia [56], [148]
• Torque or accelerator characteristic (e.g. gradient, shape, time delay)
Simulation models have been developed and utilized to understand the physical chain of interactions corresponding to shuffle. Three groups of models can generally be distinguished in terms of accuracy and complexity demand. The first group embodies models with up to four degrees of freedom (DOF). These models are mostly used for rough estimation of the lower powertrain natural frequencies (see [56], [135], [148]). The second group comprises models with about 9-15 DOF incorporating additional components like 1D engine, simplified tyre or suspension models. Such models are required, for instance, for powertrain controller design or for the analysis of the interaction of various frequency modes (tyre, transmission, suspension) up to around 100 Hz [43], [52], [62], [69], [81], [148]. The third group is characterized by simulation models with at least 20 DOF and a substantial multi-body simulation approach. Critical components are transformed from rigid to flexible state (side shafts, gears), and elastomeric bearings are introduced. The objectives are to cover a wide frequency range and to gather an in-depth understanding of comfort-related events [20], [34], [56], [67], [69]. These complex models take strong interactions between all corresponding subsystems into account [34, p. 2]. An in-depth analysis of suitable tyre model approaches in drivability is shown in [60] and [114].
More complex model approaches focus on tyre dynamics and pitch motions interacting with the powertrain oscillations due to their strong coupling at the tyre/road interface [89], [90]. Furthermore, in terms of non-linear phenomena like friction, linearized or look-up table approaches are deployed [18, p. 31]. Didcock et al. present a completely different approach by utilizing a conic (data) hull algorithm for drivability modelling [45].
Kollreider [97, p. 3] defines two important parameters for simulation model performance evaluation. Reproduction quality refers to the divergence between measured and simulated data and is determined by ordinary statistical equations. The depth of field, on the other hand, is a qualitative characterization of the level of detail with which process-specific phenomena are reproduced. It is a prerequisite for a precise analysis of component behaviour within the overall powertrain oscillation. Complex modelling of the transmission is not required for low-frequency drivability. To increase reproduction quality and depth of field, the transmission model can be reduced to two shifting elements depending on the target gear [105, p. 30].
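The source does not name the statistical equations used for reproduction quality. As an illustration, the sketch below assumes two common choices, the root mean square error (RMSE) and the coefficient of determination (R²), applied to equally sampled measured and simulated signals. The function names and sample values are hypothetical.

```python
import math


def rmse(measured, simulated):
    """Root mean square error between two equally sampled signals."""
    squared = [(m - s) ** 2 for m, s in zip(measured, simulated)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(measured, simulated):
    """Coefficient of determination of the simulation w.r.t. the measurement."""
    mean_measured = sum(measured) / len(measured)
    ss_res = sum((m - s) ** 2 for m, s in zip(measured, simulated))
    ss_tot = sum((m - mean_measured) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


# Hypothetical longitudinal acceleration samples in m/s^2.
a_measured = [0.0, 1.0, 2.0, 3.0]
a_simulated = [0.1, 0.9, 2.1, 2.9]
print(rmse(a_measured, a_simulated), r_squared(a_measured, a_simulated))
```

RMSE keeps the physical unit of the signal, whereas R² is dimensionless and therefore easier to compare across manoeuvres with different acceleration levels.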
With the increasing number of electrified powertrains, their design and validation need adjustment. Taking into account aspects like missing clutches and flywheels and, in some cases, the presence of torque-split propulsion units, the first torsional natural frequency of the system is typically increased from 2-10 Hz to a range of about 15-20 Hz [63, p. 11]. Additionally, new operation modes or mode change scenarios have to be considered, like start-stop, regenerative braking, (dis-)engagement of secondary powertrains and boosting [62]. Recent studies indicate a potential increase in maximum electric motor speed due to considerations about efficiency, costs and lightweight design. Morhard et al. [122] show a concept of a permanent magnet synchronous motor (PMSM) with a peak speed of 50,000 min⁻¹ in combination with an asynchronous motor (ASM) with a maximum speed of 30,000 min⁻¹ jointly propelling one axle. The PMSM is mounted to a two-speed transmission with ratios of 36 and 20.4. In contrast, the ASM is connected to a gearbox with a fixed gear ratio of 26.4. Those ratios contrast with transmission ratios of about 4.5-0.7 and differential ratios of 3.7-4.4 in conventional powertrains, resulting in overall powertrain ratios of around 20-2.8 (the higher the transmission ratio, the lower the gear engaged). Following equation (1), this counteracts the trend of a slightly increased first torsional natural frequency of the powertrain due to higher overall powertrain stiffness and reduced drive unit inertia. Future developments will show where the first torsional mode of such powertrains settles, but it is assumed to be higher than that of conventional ones.

C. DRIVABILITY EVALUATION
Drivability plays a substantial role in the conflict of targets between driving dynamics, efficiency, operating strategy and durability. A car manufacturer can either change the constructive design or optimize the powertrain control strategy to achieve an optimized drivability characteristic. Both require understanding how driver and passenger perceive the longitudinal acceleration of the vehicle. Therefore, the subjective evaluation of drivability has to be correlated to objective values as shown in [35], [69], [95], [111], [117], [118], [127], [136], [142], [147], and [151]. Alongside these correlation studies for passenger cars, an example for motorcycles is presented in [55].
Many studies have been carried out to investigate human perception of oscillations. Zhang et al. [169] propose considering comfort and discomfort as two separate, independent parameters. Comfort means a pleasant, restful feeling, whereas discomfort describes pain and fatigue. Their relationship is illustrated by Hartung [70], who compares a driver of a sports car with one of a standard car. The oscillations perceived in the sports car are higher due to a fundamentally stiffer suspension design. A driver of such a car accepts these in exchange for the stronger emotions evoked by spontaneous vehicle response and superior driving dynamics. Bubb et al. [30] finally put the concept of comfort and discomfort in context with the comfort pyramid stated in [101]. This concept is shown in Fig. 2.
The comfort pyramid entails all personal comfort-related needs. Once a need on a lower level is fulfilled, the need on the next higher level has to be satisfied. For instance, an unpleasant smell outweighs discomfort caused by oscillation or noise (masking effect) [30, p. 148]. Vibrations up to about 2 Hz are visible to a human, which implies that low-frequency shuffle stimulates several perception channels at once and has to be considered separately [95, p. 6]. Here, we consider drivability in scenarios where the vehicle is driven on an even road surface, without inclination and with a steering angle equal to zero (straight ahead). More complex investigations are required to reproduce real-world scenarios, especially for comprehensive validation. The interactions between driving dynamics (longitudinal, lateral and vertical) must be analyzed and optimized in a conflict of targets during the vehicle development process. Most literature uses only the longitudinal acceleration of the vehicle for characterization and assessment of the oscillation; nevertheless, coupled motions of the vehicle body occur simultaneously.
Knauer [95, p. 15] showed, by collecting data from several studies, that a plurality of human organs and body components have resonances within the drivability range stated before (Fig. 3). Investigations have been carried out regarding human exposure to whole-body vibrations in order to establish an objective correlation to human perception of vibrations. The targets of these studies were to quantify these vibrations with respect to comfort, vibration perception, and motion sickness [76, p. IV]. The authors define a frequency range of 0.5-80 Hz for health, comfort and perception of vibrations, and 0.1-0.5 Hz for motion sickness [76, p. 1]. In the context of the present paper, human perception of vibrations is of most importance and is the focus of the following remarks.
Exposing humans to specific vibrations differs in terms of human orientation (recumbent, standing, sitting) and the direction of vibration stimulation. As a result, the method described for quantifying vibration perception is based on frequency weighting functions $W$. An adequate weighting function must be applied depending on the oscillation scenario. The characteristic weighting curves are given by Fig. 4.
$W_f$ is used for motion sickness with vertical excitation and is not relevant for drivability. In contrast, $W_k$ is applied for exposure on the z-axis for a sitting or standing person or in a horizontal lying position with vertical vibration application (excluding the head). $W_d$ is selected for a sitting or standing person (longitudinal and lateral excitation) or a human in a recumbent position and horizontal exposure. Additional weighting functions are introduced for a vertical exposure of the head of a recumbent human position ($W_j$), rotational excitation (pitch, roll, yaw) of a sitting person ($W_e$) and longitudinal stimulus at the seat back ($W_c$). Objectification of human perception combines weighting functions resulting in weighting a multi-channel exposure of driver or passenger. Most authors suggest the application of $W_c$, in rare cases in combination with $W_k$ and $W_e$ and sometimes with refinement (e.g. taking into account the characteristic of amplitudes of the vibration signal) [68], [95], [127], [135], [151].
All three show a distinct plateau-like peak region around the drivability-related frequency range of about 2-10 Hz.
Contrary to the methods from [76], some authors choose a different approach either by correlating human perception to an artificial neural network (ANN) [55], [97], [111], [147] or by empirical data gathered from volunteer studies [117], [118]. Hagerodt [69], supported by Fan [56], determined a very small deviation of 3% in the perception of vibrations between driver and passenger [68, p. 32]. Even though human perception is assumed to be the same between two people under the same circumstances, the demands may differ significantly depending on country or vehicle class. This relation has to be considered in terms of target cascading for drivability-related parameters but does not affect the objective correlation of drivability.
Moreover, many other ratings exist for the evaluation of vibration signals, like standard deviation, root mean square (RMS), vibration dose value (VDV) or power spectral density (PSD). Although each approach is slightly different, studies show that they correlate clearly with each other [135, pp. 33-34].
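Two of these ratings can be written down compactly. The sketch below assumes a uniformly sampled, already frequency-weighted acceleration signal in m/s² and uses the common definitions RMS = sqrt(mean(a²)) and VDV = (∫ a⁴ dt)^(1/4), with the integral approximated by a rectangular sum. The function names and the test signal are hypothetical.

```python
import math


def rms(samples):
    """Root mean square of an acceleration signal in m/s^2."""
    return math.sqrt(sum(a * a for a in samples) / len(samples))


def vibration_dose_value(samples, dt):
    """Vibration dose value in m/s^1.75 via rectangular integration of a^4."""
    return (sum(a**4 for a in samples) * dt) ** 0.25


# Hypothetical signal: constant 1 m/s^2 for 16 s, sampled at 100 Hz.
signal = [1.0] * 1600
print(rms(signal), vibration_dose_value(signal, dt=0.01))
```

Because of the fourth power, the VDV reacts more strongly to short peaks than the RMS value, which is why the two ratings can rank transient events such as a load change differently.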
Measurement equipment for drivability investigations generally contains angular speed sensors for relevant powertrain rotating components, longitudinal acceleration sensors, diagnosis and monitoring software for corresponding control units and a data acquisition system. Some suggest additional transducers like strain gauges for indirect torque measurement or acceleration sensors for recording rotational movement (for instance, pitch movement of the vehicle body or engine) [68], [135]. Longitudinal acceleration sensors are applied to the seat rail [56], [68], [69]. In cases where the input torque cannot be measured, it must be estimated from reference measurement data or received via CAN from the motor control unit (MCU). The torque value in the MCU is similarly derived from look-up tables and quite accurate (more than 95% accuracy [34, p. 6]). Aside from the previously introduced vehicle longitudinal acceleration, additional characteristic values are shown for a typical Tip-In manoeuvre (Fig. 5).
Thus, a Tip-In manoeuvre is essentially evaluated by the maximum acceleration $a_{x,\max}$, the two stationary acceleration levels before ($a_{x,\mathrm{st},1}$) and after ($a_{x,\mathrm{st},2}$) the load change, the resulting difference between both $\Delta a_{x,\mathrm{st}}$, the peak-to-peak acceleration value for the first two peaks $\hat{a}_{x,1-2}$ and the bandwidth at the final stationary stage $\hat{a}_{x,\mathrm{st}}$ as well as the time span evaluated for the most noticeable vibrations $T_{\mathrm{ev}}$. Additionally, the acceleration gradient representing the maximum jerk $\dot{a}_{x,\max}$ is determined. Finally, an interpolation of the most noticeable peak values $\hat{a}_{x,n}$ after the load change is used to approximate a decay curve $N(t) = N_0 \cdot e^{-\lambda t}$ and thus to identify the shuffle damping ratio. Guse et al. [66] suggest taking into account further characteristic parameters like stumble (a sudden drop in acceleration just before maximum jerk) or the period of acceleration build-up for maximum jerk. Subsequent to the evaluation of each parameter on its own, a combined weighting of all factors resulting in one overall drivability evaluation score can be used. For instance, Shin et al. [151, p. 221] weight the vibration dose value (VDV) and the remaining characteristic values given by Fig. 5 equally at 50% each.
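The identification of the decay curve can be sketched as a log-linear least-squares fit of the peak amplitudes. The example below assumes positive peak amplitudes taken from the acceleration signal after the load change and converts the decay constant λ into a damping ratio via ζ = λ / sqrt(λ² + ω_d²), with the damped shuffle frequency ω_d. The function names and the synthetic peak data are hypothetical.

```python
import math


def fit_decay(times, peaks):
    """Fit ln(N) = ln(N0) - lam * t by least squares; returns (N0, lam)."""
    logs = [math.log(p) for p in peaks]  # peaks must be positive amplitudes
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(logs) / n
    cov = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, logs))
    var = sum((t - t_mean) ** 2 for t in times)
    slope = cov / var
    return math.exp(y_mean - slope * t_mean), -slope


def damping_ratio(lam, shuffle_frequency_hz):
    """Damping ratio from decay constant and damped oscillation frequency."""
    omega_d = 2.0 * math.pi * shuffle_frequency_hz
    return lam / math.sqrt(lam**2 + omega_d**2)


# Synthetic peaks of a 4 Hz shuffle decaying with N0 = 2 m/s^2, lam = 1.5 1/s.
peak_times = [0.0, 0.25, 0.5, 0.75]
peak_values = [2.0 * math.exp(-1.5 * t) for t in peak_times]
n0, lam = fit_decay(peak_times, peak_values)
print(n0, lam, damping_ratio(lam, 4.0))
```

With measured data the peaks scatter around the exponential envelope, so using more than two peaks in the fit makes the identified damping ratio less sensitive to noise.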
Correlating such parameters to human perception establishes objectification models. A precise and comprehensive objectification is vital in automating current and future drivability validation. As a matter of course, this is applicable for related validation fields, too.

D. STATE-OF-THE-ART DRIVABILITY VALIDATION METHODS
In most cases, drivability validation still takes place in time-consuming road testing [56], [68], [111], [117], [135]. Experienced application engineers must evaluate the vehicle's response to the driver's inputs and tune it iteration by iteration to the market demands or to the targets cascaded in the vehicle's design phase. This validation strategy is not automatable, time-consuming and demanding (e.g. skilled test engineers). Besides, road testing depends on the environment (e.g. weather, road surface). Since a roadworthy prototype vehicle is indispensable, such drivability validation is employed in the late development stages at the vehicle level. Yet, with this basic method, all required physical signals are available for examination.
Unlike vehicle road testing, a different validation approach has been implemented by using chassis dynamometers [50], [89], [108]. Similar to the road testing method, a prototype vehicle is required, which again limits the usability of this method to the late vehicle level validation stage. A chassis dynamometer offers high automation potential and reproducible environment conditions, for example the road surface on the drum's outer layer. Automation is achieved by substituting the human driver with robotic systems for human-machine interfaces (brake, accelerator pedal, steering wheel). In-depth drivability knowledge is the basis for correct automation and adaptation of the test vehicle to the test bed (e.g. tyre-road interaction, road load simulation). Once the road testing setup is partially replaced by test rig infrastructure, some key parameters like vehicle acceleration are no longer directly measurable. In this context, Hagerodt [68, p. 69] suggests a specific connection between vehicle and ground, supplemented with a force sensor to calculate the specimen's shuffle movement indirectly.
In terms of validation of simulation models for drivability, some authors suggest a rather fast and comprehensive validation method by comparing the non-linear dynamics with frequency response functions [63, pp. 9-10], [33, p. 6], [34, p. 10].
While aiming for even more time and cost savings, validation methods of drivability have to be developed with application potential below vehicle level. Studies of the last two decades have shown investigations into drivability analysis and validation at the subsystem level [24], [25], [135]. Subsystem validation comprises powertrain test beds with the benefit of prototype vehicle reduction. Due to a higher degree of replacement of essential vehicle components and a need for residual vehicle supply by simulation models, this validation approach is the most complex and challenging. As computation power and budget for testing are limited, a virtual extension of the real powertrain system by a virtual residual vehicle model is often realised with simplified models. For instance, complex non-linear dynamics are neglected, like tyre slip dynamics, certain vehicle aerodynamics relationships, or friction [75, p. 73]. To ensure reproducibility of specimen behaviour, the comfort functions of the electronic control unit (ECU) are disabled [163, p. 417].
The state-of-the-art approaches for enhancing agility and efficiency in vehicle development are illustrated in a V-model in Fig. 6. The vehicle as the system to be developed is designed and then validated from left to right, covering the stages of vehicle (system), subsystem (like powertrain or suspension) and components (e.g. motor, spring). With the virtualisation of validation methods and a gain in agility in the V-model, three main phases have to be distinguished. First, a transition of testing from system to subsystem level is conducted (road-to-rig approach at test beds, (1)). The second step (2) is given by virtual validation transferred from the test bed to the full simulation environment (rig-to-desktop). The final approach (3) is demonstrated by carrying out typical road tests in the virtual domain (road-to-desktop). Currently, most validation tasks are allocated to stage (1): Road-to-rig. This level of virtualisation contrasts with the design and target cascading phase, where virtualisation has already taken on a crucial role [1], [12]. A typical road-to-rig approach is presented here by virtual validation of drivability at a powertrain test bed.

IV. VIRTUAL VALIDATION IN MATTERS OF VEHICLE DRIVABILITY
The vehicle as a complex mechatronic system is usually characterised in a V-model development process, which has been adapted from software development [23]. In contrast to the development of mechatronic systems utilising waterfall or V-model concepts, software development faced a significant increase in product complexity and demands to reduce testing efforts. Hence, software developers utilized more agile development approaches like Scrum [149] or Crystal Clear [37]. The virtualisation of the vehicle development process is very challenging and requires the integration of interdisciplinary methods into a common, performant and valid framework. Section four explains the challenges for virtual validation of drivability. Additionally, it contextualises this process within overall virtual validation activities in the automotive industry.

A. A SUMMARY OF X-IN-THE-LOOP APPROACHES TO DATE
The duration of one vehicle development cycle has been reduced constantly. A survey from Morley [123] in 2017 confirms that 68% of auto manufacturers have development cycles lasting less than two years. New methods are required to reduce the most time-demanding task of system validation.
Today, various in-the-loop methods are known and implemented in different industry branches like automotive, aerospace and defence development. All have a device-under-test (DUT), or unit-under-test (UUT), coupled to a more or less complex residual environment simulation. Starting with driving simulators for training of pilots [27, p. 71], HIL applications became more complex over time. One leading indicator for the complexity of a HIL setup is the number of real DUTs involved and the extent of environment simulation. In automotive applications, the environment simulation is named residual, or rest vehicle simulation [4, p. 3]. Most recent studies show rest vehicle simulation considering dynamic vehicle models, traffic simulation (other vehicles interacting with the vehicle) and environment simulation (tyre-road-interaction, traffic signs). With vehicle-to-everything (V2X) validation, these approaches result in multi-target simulations of entire cities and complex scenarios [51].
Within the vehicle development process, the degree of complexity of the X-in-the-loop (XIL) approach continuously rises. In the first step, a model is the UUT and embedded into a simulated environment (model-in-the-loop, MIL), enabling open- and closed-loop simulations, as well as early verification of requirements and algorithms [137, p. 1]. The next stage within a development process is software-in-the-loop (SIL), comprising control unit source code and corresponding interface simulation running on a standard PC [121]. Software verification is achieved without using the target processor/hardware. By implementing the software into the target operating system, a processor-in-the-loop (PIL) is realised. MIL and SIL methods have a virtual DUT in common. Here, real-time computation of the environment simulation is not required. Starting with PIL, the residual simulation of models at specified interfaces to the DUT must be real-time capable. The present review focuses on hardware-in-the-loop (HIL) related to automotive applications specified for the topic of drivability (shuffle). Since HIL consists of real and virtual components, the term virtual validation is used in this context concurrently.
Since HIL has no unique definition due to a plurality of fields of application, a suitable description is required for the subsequent explanations. Ahmad et al. [2, p. 1] refer to HIL as an operating system of real components linked to simulated ones in real-time. In contrast, Fathy et al. [58, p. 1] recognise a setup emulating a system by immersing faithful physical replicas of some of its subsystems within a closed-loop virtual simulation of the remaining subsystems. In the context of the present paper, HIL means a real-time linkage of hardware components or subsystems with virtual residual components (digital twins). It constitutes hybrid overall system behaviour for early, agile and cost-effective calibration and validation purposes within a product development process. These systems are distinguished either as monolithic (single DUT) or distributed (multiple DUTs) [27, p. 78]. A requirement specification for fast communication between all corresponding interfaces leads to a change in data exchange from Ethernet (user datagram protocol or UDP) to EtherCAT or direct (analogue) linkage of components (see examples [51], [65], [66], [96]).
Typically, HIL is deployed where the safety of the specimen and operator is at risk, or where validation and verification are dangerous or even impossible to conduct in the real world [27, p. 71]. Fault investigation activities, adapted from basic ECU testing, become more crucial as the algorithms evolve and incorporate plenty of interfaces and interactions with the environment [161, p. 5]. Moreover, automation of vehicles necessitates reliable validation methods. In this context, functional testing described in ISO 26262 [80] significantly impacts upcoming virtual testing [113].
As there are diverse types of hardware specimens, some authors are more specific about naming the methodical approach in terms of the actual DUT type. The following classification of HIL systems applies to this paper: Table 5. HIL utilisation requires initial investments in infrastructure, soft- and hardware solutions, and skilled staff for operation and maintenance. Maintenance in this context applies to test beds and the simulation models. Availability of simulation models and the effort for parameter identification at each stage in the development cycle are further decision criteria for employment of virtual validation [6, p. 983].
Today's virtual vehicle validation demonstrates a tendency for complex XIL framework development. Albers et al. describe a general XIL framework containing a GUI, interfaces to hard- and software components, and a model library for a driver, environment and rest vehicle [4, p. 3]. To ensure tool consistency, customized solutions are not constructive [50, p. 54]. Utilization of standardized modelling languages like UML and SysML is recommended for cross-domain modelling [50, p. 70]. In this context, Albers and Düser present two general concepts for problem solution approaches (SPALTEN) and universal model handling (Contact-Channel-Model, C&CM) [3, p. 3]. SPALTEN describes a methodology for solving problems, from situation analysis to real solutions and learnings. In contrast, C&CM is a generalised modelling concept based on working surfaces, channels, and support structures [3, p. 3]. Such a modelling concept is applicable for functional (e.g. driver and controller models), physical (e.g. actual torque) and framework (e.g. bus communication) descriptions. Both concepts provide a basis for a generalized XIL framework that is described in detail in [3], [4], [5], [6], [50], and [165].
Another instance is given by a method called virtual shaft [8], [9]. By virtualising components, interfaces between real and virtual components and their interdependencies are established. In the case of virtual validation at the test bed level, performant sensors, actuators and virtual shaft algorithms are deployed for the highest validity of the HIL application. The algorithms must calculate physical interactions in hard real-time, considering sensor measurements and controlling actuators (e.g. electric motors for torque input). Andert et al. [9, p. 31] claim a lower workload of complex and highly-customised test beds compared to test beds at subsystem or component level and therefore a significant potential to increase testing efficiency by virtual shaft methods in the sense of a road-to-rig-to-desktop approach.
In summary, prevailing challenges for virtual validation and XIL frameworks are increasing complexity of rest vehicle and environment simulation, optimised trade-offs between accuracy and computational power, and influences from global vehicle development. The first topic becomes manifest in connectivity (vehicle-to-vehicle or V2V, V2X) of the DUT and related control units, which must be reproduced for adequate XIL validation [29], [51], [73], [120], [143], [156], [158]. Szalay considers a digitalised test environment, real-time localisation, low-latency communication (5G), and controllable scene objects as crucial components for an automated V2X virtual validation [156, pp. 35620-35622]. The second challenge is demonstrated in dealing with non-linearities in real-time simulation models that are demanding in computational power, like tyre [59], [60], [89], [90], suspension [89], [90], [103] or friction models [9], [99], [141]. Finally, globalisation in vehicle development poses a very tough challenge for virtual validation, involving global original equipment manufacturers (OEM), suppliers and distributed test facilities. You and Niu show the first studies on globally distributed XIL applications for powertrain and static driving simulator test beds, resulting in large communication latencies and demands for future technologies [125], [126], [165]. Here, Niu et al. successfully demonstrate prediction methods to compensate for latencies based on neural networks [126].

B. HARDWARE-IN-THE-LOOP IN THE CONTEXT OF VEHICLE DRIVABILITY
Drivability is considered one of the most time-consuming calibration tasks in vehicle development [41, p. 1]. The application of a dual-clutch transmission alone comprises about 200 parameters [5, p. 4]. Virtual validation based on road-to-rig approaches with adequate rest vehicle virtualisation, objectification of characteristic values and automation of the calibration tasks offers considerable potential for front-loading and efficiency improvement [136]. Model-based development ensures future competitiveness in complex vehicle development [11, …].
A requirement specification for the HIL setup has to be determined regarding virtual drivability validation. The relevant maximum frequency $f_{max}$ of drivability phenomena amounts to 30 Hz for current specimens (see section III-B). Two aspects come to the fore when deriving a requirement for the maximum macro step size of a suitable HIL application: First, all combinations of sensors and data acquisition systems must allow a sampling frequency $f_s$ of at least 60 Hz, i.e. $f_s \geq 2 f_{max}$, according to the Nyquist-Shannon theorem [150] (Eq. (2a)). Second, a stable and reproducible closed-loop performance [116, p. 437] is reached for control loop frequencies $f_{HIL}$ of the HIL system of about 180 to 600 Hz, i.e. macro step sizes of about 1.7 to 5.6 ms (Eq. (2b)). For the utilisation of a continuous controller as a time-discrete version, $f_{HIL} > 20 \cdot f_{max}$ applies [116, p. 437]. An overview of the studies reviewed for X-in-the-loop activities is shown in Table 6. The realised macro step sizes of the studies reviewed are presented in Table 7.
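The timing requirements derived above can be reproduced with a few lines of arithmetic. The following is a minimal sketch, not tooling from the reviewed studies: the function name `hil_timing_requirements` is hypothetical, and the lower oversampling factor of 6 is an assumption inferred from the ratio 180 Hz / 30 Hz, whereas the factor 20 follows the rule $f_{HIL} > 20 \cdot f_{max}$ cited from [116].

```python
def hil_timing_requirements(f_max_hz, factor_low=6.0, factor_high=20.0):
    """Derive sampling and control-loop requirements from the highest
    drivability frequency of interest.

    factor_low is an assumption (180 Hz / 30 Hz in the text);
    factor_high reflects the rule f_HIL > 20 * f_max.
    """
    if f_max_hz <= 0:
        raise ValueError("f_max_hz must be positive")
    f_s_min = 2.0 * f_max_hz            # Nyquist-Shannon lower bound
    f_hil_low = factor_low * f_max_hz   # lower end of the loop-frequency band
    f_hil_high = factor_high * f_max_hz # upper end of the loop-frequency band
    return {
        "f_s_min_hz": f_s_min,
        "f_hil_hz": (f_hil_low, f_hil_high),
        # A higher loop frequency corresponds to a smaller macro step size.
        "step_ms": (1000.0 / f_hil_high, 1000.0 / f_hil_low),
    }


req = hil_timing_requirements(30.0)
print(req)
```

For $f_{max}$ = 30 Hz this gives a minimum sampling frequency of 60 Hz and a macro step size band that rounds to the 1.7 to 5.6 ms stated in the text.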
In addition, time characteristics of the control loop like dead time and jitter increase the demand for smaller step sizes. Many studies show approaches where transfer functions of type PT1 or PT2 are used for system dynamics representation [8, p. 103], [144, p. 48]. A transfer function in combination with a Padé approximation represents system dynamics and dead time behaviour [144, p. 48], [115, p. 345]. Eq. (3) determines the electric motor (EM) torque build-up as a PT1 transfer function with dynamic time constant $\tau_{EM,dyn}$ and dead time $\tau_{EM,dead}$. Torque response times vary with the type of EM and its integration concept. Lindvai-Soos and Kaimer [110, p. 49] present an all-wheel drive concept for a D-segment battery electric vehicle (BEV) with a disengageable PMSM as a secondary drive unit (ASM: 60 to 70 ms, PMSM: 250 ms).
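To illustrate how a PT1 element with dead time and its Padé approximation relate, the sketch below compares the exact frequency response $e^{-s\tau_{dead}} / (\tau_{dyn} s + 1)$ with a version in which the dead time is replaced by a first-order Padé term. The order of the approximation, the function names and the numerical values are assumptions chosen for illustration; they are not taken from the reviewed studies.

```python
import cmath
import math


def pt1_dead_time(omega, tau_dyn, tau_dead):
    """Exact frequency response of a PT1 element with dead time."""
    s = 1j * omega
    return cmath.exp(-s * tau_dead) / (tau_dyn * s + 1.0)


def pt1_pade1(omega, tau_dyn, tau_dead):
    """Same element with the dead time replaced by a first-order
    Pade approximation (1 - s*tau/2) / (1 + s*tau/2)."""
    s = 1j * omega
    pade = (1.0 - s * tau_dead / 2.0) / (1.0 + s * tau_dead / 2.0)
    return pade / (tau_dyn * s + 1.0)


# Illustrative values only (hypothetical, not from the reviewed studies).
TAU_DYN = 0.020   # dynamic time constant in s
TAU_DEAD = 0.010  # dead time in s

for f_hz in (2.0, 10.0, 30.0):
    w = 2.0 * math.pi * f_hz
    exact = pt1_dead_time(w, TAU_DYN, TAU_DEAD)
    approx = pt1_pade1(w, TAU_DYN, TAU_DEAD)
    # Phase of the ratio avoids wrap-around when comparing two phases.
    phase_err = abs(cmath.phase(approx / exact))
    print(f"{f_hz:5.1f} Hz  |G| = {abs(exact):.3f}  "
          f"phase error = {phase_err:.4f} rad")
```

The Padé term is an all-pass, so the magnitude is unchanged and only the phase deviates, with the deviation growing towards higher frequencies. This is one reason why the approximation order has to be matched to the frequency range of interest.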
Another example is the relaxation length of the tyre $\sigma_x$, a first-order delay whose consideration in a HIL application is necessary due to its impact on the overall damping ratio.
The longitudinal relaxation length, as the ratio of longitudinal carcass stiffness $C_s$ to longitudinal slip stiffness $C_x$, is the distance the tyre has to cover by rotation to generate 63.2% of the steady-state longitudinal force [89, p. 8].
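The 63.2% figure follows from treating the force build-up as a first-order lag in travelled distance. The sketch below assumes this simple exponential model; the function names and the relaxation length of 0.3 m are hypothetical and serve only as an example.

```python
import math


def force_fraction(distance_m, sigma_x_m):
    """Fraction of the steady-state longitudinal force built up after the
    tyre has rolled distance_m, modelled as a first-order lag in distance."""
    if sigma_x_m <= 0:
        raise ValueError("sigma_x_m must be positive")
    return 1.0 - math.exp(-distance_m / sigma_x_m)


def time_constant_s(sigma_x_m, speed_mps):
    """Equivalent time constant of the lag at constant rolling speed."""
    if speed_mps <= 0:
        raise ValueError("speed_mps must be positive")
    return sigma_x_m / speed_mps


SIGMA_X = 0.3  # m, hypothetical relaxation length

print(force_fraction(SIGMA_X, SIGMA_X))   # fraction reached after one relaxation length
print(time_constant_s(SIGMA_X, 1.0))      # time constant at 1 m/s
print(time_constant_s(SIGMA_X, 30.0))     # time constant at 30 m/s
```

Under this assumption the equivalent time constant grows as speed decreases, which suggests why the effect is most relevant for low-speed manoeuvres such as driveaway.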
Numerical stiffness and singularity issues of deployed simulation models are other challenges for virtual drivability validation. A typical example is the calculation of longitudinal wheel slip $s_i$, since tyre modelling is one of the most complex mathematical modelling challenges [114, p. 34]. Here, introducing a speed threshold $v_{th}$ prevents the singularity problem, especially for driveaway events [89, p. 7], where $\omega_i$ refers to the angular velocity of the wheel, $r_d$ is the dynamic tyre radius, and $v$ is the longitudinal velocity.
An appropriate method for virtual validation of drivability at a powertrain test bed demands in-depth system understanding regarding the complete closed-loop HIL setup. It is mandatory to match the DUT characteristic in a HIL application at the test bed level to the actual behaviour of the DUT implemented in the vehicle. In the context of drivability, this applies to dynamic properties, for instance, natural modes, damping ratio or transient dynamics (e.g. dead time). The matching approach has to be model-based. As detailed system identification of the complete HIL system is demanding and inexpedient within the development process, new methods have to be found here.
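The wheel slip singularity at standstill mentioned above can be illustrated with one common regularisation, in which the vehicle speed in the denominator is bounded from below by the threshold $v_{th}$. This is a sketch under that assumption and does not claim to reproduce the exact formulation of [89]; the function name and the default threshold of 0.5 m/s are hypothetical.

```python
def longitudinal_slip(omega_radps, r_dyn_m, v_mps, v_th_mps=0.5):
    """Longitudinal wheel slip with a speed threshold in the denominator.

    Without the threshold, the expression divides by zero at standstill
    (v = 0), which is the typical driveaway situation.
    """
    if v_th_mps <= 0:
        raise ValueError("v_th_mps must be positive")
    denominator = max(abs(v_mps), v_th_mps)
    return (omega_radps * r_dyn_m - v_mps) / denominator


# Driveaway: wheel already turning, vehicle still at rest.
print(longitudinal_slip(omega_radps=10.0, r_dyn_m=0.3, v_mps=0.0))
# Normal driving: threshold inactive, conventional slip definition.
print(longitudinal_slip(omega_radps=35.0, r_dyn_m=0.3, v_mps=10.0))
```

The threshold trades physical accuracy at very low speed for a bounded, numerically well-behaved slip signal, so its value has to be chosen with the manoeuvres under test in mind.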

C. OPPORTUNITIES AND THREATS FOR VIRTUAL VALIDATION OF DRIVABILITY
In the following, we discuss opportunities and threats of virtual validation in vehicle development. First, the virtualization of vehicle validation tasks shows significant improvements in testing efficiency and agility. Virtual engine calibration allows between 20% [48, p. 13] [91, p. 244]. As a consequence, some authors postulate a strict transformation of testing activities from road testing to rig and simulation environment [54, p. 2].
The general advantages of modelling provide another benefit of virtualising the validation task. Modelling physical systems always means simplifying the system's behaviour to a certain degree. Implementing virtual validation methods in an experimental testing environment can simplify the required equipment by scaling, model partitioning and surrogation concepts. All three concepts enhance efficiency by reducing computational demand and costs.
The concept of scaling reduces the complexity of the DUT as it is downscaled. For example, a traction battery for an efficient vehicle simulation is modelled by a representative battery cell that is upscaled by simulation models in the virtual world [40]. Real prototype hardware for a full battery is not required, which saves costs and efforts for maintaining and operating the specimen. However, scaling carries a risk. Scaling effects occur if the difference in scale of different HIL components exceeds a certain level [132, p. 1078]. Those effects might primarily occur in coupled virtual-experimental environments, where latency-related communication technologies link physical and virtual signals. For instance, a simple battery cell model may be well understood. By upscaling such a model, however, effects can occur between the primary cell model and its scaled duplicates that are unknown or were not considered in the model from the outset. Unknown effects are a threat to a transparent method utilizing virtual validation.
The method of partitioning decouples model parts so that their interdependencies are reduced. The aim is to regroup subsystems of similar time dynamics and isolate numerically stiff parts [92, p. 5]. By doing this, the computational load is reduced. Future work in this context will investigate model partitioning for multicore architectures, parallel computing, and variable step solvers [92, p. 8].
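The scaling idea and its risk can be made concrete with a deliberately simple sketch. It assumes an ideal pack of identical cells described by an open-circuit voltage and an internal resistance; the function names and all numerical values are hypothetical and not taken from [40].

```python
def cell_voltage(ocv_v, r_int_ohm, current_a):
    """Terminal voltage of one cell (positive current = discharge)."""
    return ocv_v - r_int_ohm * current_a


def pack_voltage(ocv_v, r_int_ohm, pack_current_a, n_series, n_parallel):
    """Upscale a single-cell model to a pack of identical cells.

    Assumption: every cell behaves exactly like the representative cell,
    so the pack current splits evenly over the parallel strings.
    """
    if n_series < 1 or n_parallel < 1:
        raise ValueError("n_series and n_parallel must be at least 1")
    cell_current = pack_current_a / n_parallel
    return n_series * cell_voltage(ocv_v, r_int_ohm, cell_current)


# Hypothetical pack: 100 cells in series, 2 strings in parallel.
print(pack_voltage(ocv_v=3.7, r_int_ohm=0.002, pack_current_a=100.0,
                   n_series=100, n_parallel=2))
```

The identical-cell assumption is exactly where scaling effects can hide: cell-to-cell variation, uneven current sharing or thermal gradients are absent from the upscaled model unless they are added explicitly.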
Model surrogation, as presented by Kozaki et al. [99] in a six-step approach, aims to reduce computational demand by model simplification. The six-step algorithm, reviewed for a manual transmission dynamics simulation, results in 63% time reduction of the simulation task [99, p. 3]. This is realized by customized linearization, elimination of higher-order or irrelevant fast dynamics [99, p. 2], [71]. In summary, model surrogation contributes to an efficient virtual validation task alongside model scaling and partitioning principles showing significant benefits.
In contrast to these opportunities for strengthening efficiency, one major problem for virtual validation is the missing standard for proof of the validity of its methods. Verification refers to a specification check for a system at various levels and degrees of detail. In contrast, validation means a proof of system characteristics regarding predefined use [21, pp. 7-8]. Both terms are used to compare actual characteristics to specified demands of virtual and real prototypes. Within the scope of road-to-rig-to-desktop approaches, the conventional definitions do not cover the required scope for hybrid testing via HIL. Schmidt and Frings [146, p. 49] find a first standard for handling of virtual validation in ISO 19364 (steady-state circular driving, [79]) and ISO 19365 (sine with dwell stability control testing, [78]). Data points from simulation and road testing must be analysed in a cross plot for certain characteristic values like steering-wheel angle or lateral acceleration. The virtual validation is considered valid if the testing results all lie within the simulation data range [79, pp. 7-9]. This example shows only a rough, minimal definition for virtual validation handling, which is not to be generalised.
Within the studies reviewed, there are many terminologies for requirement specifications for virtual components (simulation models/software) in terms of HIL virtual validation. Riel […], where $t_i$ refers to the test execution time for a sample, $t_T$ is the total test execution time of all tests of a group $A$, $K_i$ is a shape factor and $C_i$ is the test reliability. $K_i$ is used in cases where the HIL test is more accurate than the real-world test. This specific case occurs, for example, if the experimental results of the larger group of tests strongly depend on a certain temperature value, and this value is hard to hold constant. A HIL setup in which the physical temperature chain of interactions is virtualised is beneficial here and hence more reliable and representative. The reliability $C_i$ is given in [49, p. 6]. Here, $P_{in}$ and $P_{out}$ denote the input and output ECU board precision, respectively, and $T_E$ is the equivalent trueness of the dynamic model used in the ECU. With this calculation, hardware errors and tolerances of the control and interface units as well as modelling inaccuracy are taken into account. The trueness is derived from the Normalised Root Mean Square Error (NRMSE) of predicted values $\hat{y}_i$ and measured values $y_i$ [49, p. 4].
The calculation of HIL representativeness contributes to an overall approach to defining and measuring validity in virtual validation tasks. Other studies suggest methods for proving validity by manoeuvre-based validation of hybrid drives [119] or scenario-based proof of ADAS (advanced driver assistance system) functions [156]. Viehof presents a validation concept based on objective, stochastic principles in the context of driving dynamics [160]. The primary purposes of the validation concepts are traceability, objectification, practicability and expressiveness [160, p. 6].
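The trueness measure mentioned above rests on the NRMSE between predicted and measured values. The sketch below shows one way to compute it. The normalisation by the range of the measured values is an assumption, since several NRMSE conventions exist and the one used in [49] is not reproduced here; the function name and the sample data are hypothetical.

```python
import math


def nrmse(measured, predicted):
    """Root mean square error normalised by the range of the measured data.

    Assumption: range normalisation (max - min); other conventions
    normalise by the mean or the standard deviation instead.
    """
    if len(measured) != len(predicted) or len(measured) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((y - y_hat) ** 2
              for y, y_hat in zip(measured, predicted)) / len(measured)
    span = max(measured) - min(measured)
    if span == 0:
        raise ValueError("measured values must not all be identical")
    return math.sqrt(mse) / span


# Hypothetical torque samples in Nm: measurement versus model prediction.
y_meas = [0.0, 1.0, 2.0, 3.0, 4.0]
y_pred = [0.1, 0.9, 2.1, 2.9, 4.1]
print(nrmse(y_meas, y_pred))
```

A value close to zero indicates that the model tracks the measurement well relative to the signal range; how this value maps to the trueness $T_E$ is defined in [49] and not in this sketch.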
The validation results iteratively influence real and virtual HIL component design from a single test bed to a complete test facility [32] and geographically distributed test centres (see examples [125], [126], [165]). The opportunities and threats for virtual validation that we mentioned before must be considered for designing such test beds.
In conclusion, the validation concept is a crucial enabler for virtual validation. Due to complexity issues from experimental testing, modelling and process peculiarities, a robust methodology for proving overall virtual validation reliability and correctness is considered the basis for all other related research activities. A performant validation concept for virtual validation supports not only the digitalisation of the testing process in general but also determines an optimal requirement specification for the HIL setup. The future research focus for validation must consider not only the specimen characteristic (real or virtual) but also the testing equipment like test beds or driving simulators as proper development tools in a consistent toolchain for agile vehicle development. A Product-Lifecycle-Management system (PLM) must utilise knowledge about the HIL system, all models and measurement results [64], [146]. Automated failure detection of outlying measurement data assists the validation task as well [138]. Drivability control functions have already been subject to ADAS functions like ACC or AEB. Therefore, the same risk in terms of function approval applies as it does for all other ADAS-related applications. Wachenfeld and Winner [162], amongst others, estimate a driving distance of autonomous test vehicles in road testing of about 2-36m test kilometres under certain conditions. The question is no longer whether to validate virtually but how to realise virtual validation comprehensibly.

V. CONCLUSION AND OUTLOOK
The vehicle development process undergoes a significant transition by virtualising the validation methods. The increasing percentage of software algorithms and functions does not only convert the vehicle into a smartphone on wheels but also calls for a re-definition of vehicle development and validation strategies. Conventional approaches from design to validation phase in a V-model are not performant enough for ever more intelligent cars. The development process of those cars does not finish at the start of production (SOP) but proceeds thanks to software over-the-air updates, as it does for personal computers or smartphones. The authors have shown many interdependencies in the vehicle development process, meaning that the development process cannot be changed instantly. A continuous transformation by digitalisation and virtualisation of development methods contributes to more efficient and agile development. A systematic literature study presents a rapid increase in the complexity of virtual components. Coming from straightforward ECU calibration, recent applications relate to complete vehicle models as part of a comprehensive virtual environment up to geographically distributed, multi-target traffic simulation of complex vehicle dynamics. The reviewed studies incorporate various DUTs at component, subsystem and system levels, with varying model and specimen complexity. All HIL applications are combined into a representative automotive HIL overview, providing an appropriate nomenclature and a standardised definition for HIL. The trade-off between the accuracy of simulation models and computation demand is pointed out by contemplation of realised macro step sizes of basic application examples. Simple lumped torsional oscillation models with less complexity achieve cycle frequencies up to 8 kHz, whereas very complex V2X tasks only feature 10-40 Hz. Geographically distributed validation activities represent today's most demanding automotive HIL applications.
The feasible macro step size is in the range of 250 ms due to the latency effects of the signal transmission. Concerning the fundamentals of signal, system and control theory, the current technology performance determines the representable frequency range by the macro computation step size indicator. Responsible engineers must debate the practicable degree of virtual validation with regard to the physical phenomenon, the availability of the simulation models and the justifiable parametrisation effort.
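As a back-of-the-envelope check, the rules of section IV-B can be applied to the 250 ms macro step size. This is a rough sketch rather than a result from the reviewed studies; the symbols $T_{HIL}$ and $f_{max,rep}$ (highest representable phenomenon frequency) are introduced here for illustration only.

```latex
\begin{align}
f_{HIL} &= \frac{1}{T_{HIL}} = \frac{1}{0.25\,\mathrm{s}} = 4\,\mathrm{Hz} \\
f_{max,rep} &\le \frac{f_{HIL}}{2} = 2\,\mathrm{Hz}
  \quad \text{(Nyquist-Shannon bound)} \\
f_{max,rep} &< \frac{f_{HIL}}{20} = 0.2\,\mathrm{Hz}
  \quad \text{(closed-loop rule } f_{HIL} > 20 \cdot f_{max}\text{)}
\end{align}
```

Under these assumptions, phenomena in the range of a few hertz lie at or beyond what such a distributed setup can represent in closed loop, unless latency compensation methods are applied.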
Vehicle shuffle with a frequency range of 2-10 Hz for conventional specimens represents a typical example of current challenges for virtual validation. The state of the art in this field of validation is still testing on the road or on a roller test bench. Since the boundary conditions for computation demand are lower for drivability, phenomenon reproduction with higher accuracy is possible. The assumption of future vehicles being electrically driven, autonomous and connected affects most vehicle attributes. In terms of drivability, electric motorsports already highlight stiffer powertrain components, increasing the shuffle frequency up to 20-40 Hz. Automation of the validation task exploits the full potential of virtual validation. Here, objectification is a crucial enabler for process automation, especially in terms of drivability.
Missing standards pose a fundamental problem for virtual validation with regard to proof of validity and ascertainment of the fidelity of the HIL application. Recent studies suggest using quantitative methods based on the calculation of trueness, precision and reliability, resulting in an overall HIL representativeness. However, the holistic, detailed investigation into the HIL system that this requires is usually not affordable or realisable. Some information on the HIL system's components is inaccessible; thus, the HIL setup presents a control loop of several black-box parts. A suitable, general proof of validity for virtual validation must be feasible for all development engineers. Based on system identification methodologies, future research is required to determine the transfer behaviour of HIL black-box components. With this information, a HIL representativeness and validity calculation can be performed, highlighting the actual limitations and inaccuracies of the virtual validation approach.
The strong demand for even more efficiency and agility in vehicle development enforces virtual validation. During a continuously updated process, potential fields of virtual validation in the development process must be identified and realised. In this context, future work should establish a consistent toolchain of experimental and virtual tools stage-by-stage alongside the V-model. Common standards for virtual validation within the automotive sector enhance confidence in the methods and usability in the heterogeneous, global development process.
Table 4.

APPENDIX B ADVANTAGES AND DISADVANTAGES OF HIL APPLICATIONS
See Table 5.

APPENDIX C REVIEWED HARDWARE-IN-THE-LOOP APPLICATIONS
See Tables 6 and 7. VOLUME 11, 2023

DISCLOSURE STATEMENT
The authors report there are no competing interests to declare.

Since 2010, he has been the Head of the Department of Durability, NVH and Validation Methods, Institute of Automotive Technologies Dresden, Chair of Automotive Engineering, TUD. Aside from that, he is responsible for the build-up of a comprehensive laboratory called the Automotive Testing Facility, including more than 15 test stands. His current research interests include elastomer mounts, motorcycle testing, virtual validation, and XiL development methodologies.

From 1998 to 1999, he was a Visiting Scientist with the Mechanical Systems Control Laboratory, University of California at Berkeley, Berkeley, CA, USA, with a focus on modeling human vehicle driving by model predictive online optimization. He has worked in the fields of driving dynamics and objectification of the same with AUDI AG, Ingolstadt, Germany, from 1999 to 2002. Later, he had leading positions with BMW AG, Munich, Germany, in the fields of driving dynamics, functional safety, and whole vehicle validation, from 2002 to 2010. Since 2010, he has been a Full Professor with the Chair of Automotive Engineering, Technische Universität Dresden (TU Dresden). Aside from that, he has been the Dean of the "Friedrich List" Faculty of Transport and Traffic Sciences, TU Dresden, from 2018 to 2022. Most recent projects include the development of a self-propelled driving simulator and the build-up of a comprehensive laboratory called the Automotive Testing Facility comprising more than 15 test stands. He is the author of more than 160 articles and two books, and holds four patents. His research interests include driving dynamics, ride comfort, human-machine-interface, driving simulators, and virtual development and testing.