A Perceptual Model-Based Approach to Plausible Authoring of Vibration for the Haptic Metaverse

Haptic virtual reality is often misunderstood as being solely a physically identical copy of real environments. Thus, a perfect recording and reproduction of vibration that is indistinguishable in an A:B comparison is often the aim. However, in most virtual reality applications the real environment is not available for direct comparison. Instead, when judging the plausibility of a presented scene, users compare the vibration to their expectations shaped by the audiovisual context. Therefore, it should be sufficient to find any vibration that the user expects to potentially occur in the given context. Such a vibration needs to elicit a perceptual profile with a minimal distance to an expected profile in the sensory tactile perceptual space. Building on this formalization, this article demonstrates a novel generative model-based approach to authoring vibrations. First, users quantify expectations as tactile profiles consisting of ratings of six sensory tactile attributes without the presence of vibrations. Subsequently, the model predicts vibration parameters from such profiles. This ensures the fulfillment of user expectations and thus high plausibility. Furthermore, it eliminates the necessity of recordings, which are infeasible for scenes with no real counterpart, and opens the door to crowdsourcing the authoring process with laypersons for the haptic metaverse.


I. INTRODUCTION
VIRTUAL reality interface technologies, e.g., head-mounted displays or even haptic interfaces, have seen remarkable progress recently. This has enabled new virtual reality applications in many fields [1]. They offer the advantage of exposing the user safely to a controlled environment, e.g., in flight simulators or driving simulators. But other applications are increasingly utilizing virtual reality as well, e.g., e-commerce [2], where users can virtually test drive cars or even experience tactile interaction with products such as clothing. In the health sector, they are being used for virtual phobia therapy [3]. Vibration is added to cinemas to create a form of virtual reality and thus a more immersive experience [4]. The technological progress gave rise to the vision of the Metaverse. Fusing together several views [5], the core concept of the Metaverse aims at connecting users in a shared immersive virtual environment. Instead of the expert-created environments of classical virtual reality, the Metaverse should involve or even enable users to author such environments themselves.
This work involved human subjects or animals in its research. Approval of all ethical and experimental procedures and protocols was granted by the Ethics Committee of the Technische Universität Dresden under Application No. SR-EK-111032020-Amendment, and performed in line with the Helsinki Declaration. Digital Object Identifier 10.1109/TOH.2023.3318644
All the applications have a common goal, i.e., that the users should perceive and interact with the virtual environment just as they would with a real environment. There are two factors influencing such a realistic experience, as suggested by Mel Slater [6]: the place illusion and the plausibility illusion. The place illusion is mostly constrained by the technical capabilities of the virtual environment, e.g., insufficient display resolution or latency breaking the immersion. The plausibility illusion depends on the content of the depicted scene, i.e., whether it relates to the user expectations. While the place illusion has received much interest from researchers, the plausibility illusion has attracted much less attention. There are two general approaches to elicit the plausibility illusion: the authentic approach and the plausible approach [7].

A. The Authentic Approach to Authoring Vibrations
The authentic approach aims at eliciting percepts in a virtual environment that are identical to those in the real environment, i.e., they are indistinguishable in an A:B comparison. This approach requires a lot of effort because the creator has to record the real environment properly, requiring equipment and expertise. This makes it impractical for a large number of scenes. Alternatively, modelling the physical excitation process is only practically feasible for a narrow set of contexts. Both methods assume physically accurate reproduction. This is often difficult because many vibration reproduction systems, such as linear resonant actuators, are constrained in bandwidth and dynamic range. One of the main advantages of virtual environments is that they are in theory not limited to scenes with a real counterpart. However, to record the real counterpart of a virtual environment, access to the corresponding real environment is required. This makes the approach infeasible for contexts such as "space ships".
To overcome these disadvantages, it has been suggested to utilize optical or acoustical signals to generate vibrations algorithmically. Depending on the content of the scene, [8] suggests the division into slow and fast point-of-view movement, discrete and continuous object movement, impulses and vibration from, e.g., explosions, and context movement, e.g., vibration following a step on the gas pedal of a car. Only for slow movements is the reconstruction from the optical flow of the scene easily feasible, while the other classes mostly require an understanding of the spatiotemporal or even semantic content of the scene. The findings of [9] confirm that such reconstruction might work for scenes with simple semantics but is difficult for more complex semantics. For optical signals in the context of riding a bicycle or motorcycle, this has been demonstrated by [10]. The low sample rate, i.e., the frame rate typical for videos, limits the reconstructible frequency range, impeding the plausibility or requiring further heuristics for the algorithm.
Extracting vibration from acoustic signals is another option, which has been attempted by [4] using the LFE channel of concert DVDs. Again, the extractable frequency range is limited, here to 20 Hz to 120 Hz. Furthermore, when comparing the extracted vibration to manually designed vibration, the users perceive it as much less plausible. Overall, these approaches are oblivious to the semantics of the scene and to the expectations of the users. Therefore, the algorithm cannot ensure the plausibility of the produced vibration, and thus the creator would need to validate the actual plausibility manually in a user study. The current methods of the authentic approach require experts for the authoring, either for conducting measurements or for improving non-optimal reconstruction algorithm outputs. Thus, they do not readily enable the users to author the vibrations for the environments themselves.
Furthermore, the authentic approach implies that the recorded vibrations are always optimally plausible in the virtual environment. However, putting the user expectations at the center might explain some surprising effects. For example, in a virtual basketball game, the users perceive the visual and auditory cues of the ball being dribbled. Thus, they would like to experience matching tactile cues even though such cues are not present in the real environment. Indeed, watching the players dribbling the ball was perceived to be much more plausible when matching vibration was presented to the feet, despite these vibrations not being perceivable while sitting in the stands of the sports stadium [11]. It has been suggested that in the presence of contradictory stimuli in different modalities, the most convincing cue of the context can dominate perception [12]. Indeed, it has been demonstrated that the tactile expectations of vibration are modulated by the audiovisual context and affect plausibility ratings as well as neural activation patterns [13].

B. The Plausible Approach to Authoring Vibrations
Given that in most applications the real environment is not available for direct comparison to the virtual environment anyway, the question arises whether the authentic approach is worth the effort. Therefore, the plausible approach only aims at creating tactile sensations that might have occurred in a comparable real environment [7]. Thus, users have to compare the vibration presented in the virtual environment to their expectations. Inspired by the definition of quality as a measure of match between expected and elicited properties of a product [14], the plausibility judgement can be similarly defined [15]. Therefore, vibrations are only optimally plausible in a context if the elicited tactile profile matches the context's expected tactile profile. Since any vibration fulfilling this requirement is acceptable, the approach seems far less demanding in terms of accuracy of vibration assessment or reproduction. However, while it is easy to find out whether a given vibration is indeed plausible, the inverse task of finding the vibration properties that make it plausible is much more difficult.
Therefore, the status quo is to proceed by trial and error and iteratively evaluate the design by presenting vibration in user studies until the creator finds a plausible vibration. Especially with little experience, this is quite inefficient. However, relying on experienced design professionals can only ensure that the expectations of the expert are met, which are not necessarily representative of the average user. To avoid repeating manual vibration design for the same context, it has been suggested to create databases of context-vibration effect mappings [16], e.g., a vibration effect for the tactile sensation of feeling rain on the shoulder. Unfortunately, to create a new item in the database, the creators still have to repeat the design process for each new scene. Furthermore, even for gradual changes in the situational context they have to repeat the design process.
Since the same vibration may be plausible in various situational contexts, one solution would be not to link a vibration to a specific situational context (e.g., feeling rain on the shoulder). Instead, the creators can provide a context-independent description of the vibrations from a perceptual perspective, i.e., with tactile attributes such as "tingling". Thus, they could re-use the vibration designed for one context in a different, perceptually similar context. Such an approach intended for product design, aiming to speed up the finding of suitable feedback vibrations, has been shown by [17]. However, simply populating the database with arbitrary, i.e., non-parametric vibration signal items still impedes the creation of nuanced feedback. If the relationship between vibration parameters and the elicited tactile attributes is not known explicitly, no targeted gradual changes are feasible. Tuning the physical parameters (e.g., energy) of discrete vibration items of a database was proposed by [18] to effect continuous changes in perceptual attributes for tactile product design. Their results suggested that the tuning relationships between vibration parameters and perceptual attribute ratings varied depending on the vibration item selected, i.e., they were not generalizable across all items.
To properly take into account the relationship between vibration parameters and the elicited perceptual attributes, it is necessary to sufficiently describe everyday life vibrations by their physical parameters, i.e., level, frequency, etc., as well as to sufficiently describe the tactile perceptual space. This was attempted by the authors in [19]. They generalized vibration into four excitation patterns (impulse-like, bandlimited white Gaussian noise, amplitude-modulated sinusoidal, sinusoidal) characterized by the parameters level, frequency, bandwidth, modulation frequency and decay constant. They represented the sensory tactile perceptual space by the six tactile attributes "weak," "up and down," "tingling," "repetitive," "even" and "fading". Based on the observed mappings [20], the authors of [19] proposed to enable more gradual changes by creating new vibration items from the dataset through interpolating the vibration parameter profiles as well as tactile profiles containing ratings of these six perceptual attributes. They validated the approach by assessing tactile profiles of audiovisual tactile scenes and finding a vibration item with a similar tactile profile. The vibration selected only by its tactile profile was perceived as similarly plausible as recorded vibration in a virtual environment when presented in the audiovisual context assessed with the vibration recordings in a real environment. This implies that the six tactile attributes are sufficient for vibration generation.
Fig. 1. Overview of the concept of the novel approach to authoring vibrations. First, for a communicated context, an expected tactile profile is defined by future users with the tactile design language proposed in [18]. This tactile profile is input to the regression model that predicts parameters of a vibration with an elicited tactile profile similar to the expected profile. Finally, the model outputs a vibration that matches expectations of the given context and is thus plausible.
The disadvantage of the presented approaches is their reliance on databases with a limited number of discrete vibration items. Thus, they neither generalize to a wide range of scenes nor enable fine-grained changes. Therefore, only a model that can predict vibration parameters continuously from tactile profiles could enable optimal vibration for the systematic elicitation of the tactile plausibility illusion for arbitrary scenes. The current methods of the plausible approach seem to enable the users to author the vibrations for the environments themselves, because they can select suitable vibrations from a database according to intuitively understandable perceptual attributes or even by context descriptions. However, there is mostly still a strong necessity for tuning the parameters of the vibrations to match the audiovisual context optimally, which would be difficult for non-experts without perceptual handles.

C. Goal of This Study
Applying the plausible approach, the goal of this article is to efficiently find a vibration that would be perceived as highly plausible in a given arbitrary audiovisual context in virtual reality. Building on [19], [20], a synthesis model is built and validated. The core principle is an explicit quantification of the expected tactile profiles and their translation into vibration signals with an elicited tactile profile maximally similar to the expected profile (see Fig. 1). By relying on the layperson-understandable tactile design language [19] for quantifying the expected perceptual profiles, the method should enable users to author vibrations themselves for the future haptic metaverse.

A. Dataset
For the synthesis model to be applicable over a wider range of scenarios, a sufficiently large dataset is required. In contrast to visual judgements that are facilitated by ubiquitous and homogeneous hardware, collecting huge tactile datasets remains a challenge, since reproduction system heterogeneity impedes crowdsourcing tactile judgements. Sampling vibration randomly would require an extreme amount of recordings and corresponding perceptual judgements. Furthermore, such a strategy would strongly require demonstrating generalizability of the model by cross-validation, reducing the training data set further. Therefore, the necessity of more efficient strategies arises. One potential solution suggested by [21] for the auditory domain is to approximate the physical excitation processes producing the sounds iteratively. If progressively changing excitation processes are approximated by successive segments of stationary excitation processes and impacts, then vibration may be caused by four general classes of excitation mechanisms according to [22]: periodic mechanical processes, i.e., sinusoidal vibration; correlated periodic excitation, i.e., amplitude-modulated vibration; superimposition of uncorrelated sources, i.e., variable-bandwidth narrowband noise; and impacts.
Although successive segments of these fundamental excitation patterns may approximate all vibration-producing haptic interactions, it would be a drastic oversimplification for the goal of physically correct, i.e., authentic vibration synthesis. In contrast, the plausible approach acknowledges that tactile receptors can only resolve a fraction of the variation of such excitation. The constrained capabilities of tactile receptors, as evident from investigations of frequency selectivity of tactile receptors (the just noticeable difference in frequency (JNDF) is far above that of auditory perception, i.e., approximately 30 % [23], [24]) and masking effects [25], [26], result in a reduced ability to resolve temporally and spectrally complex vibration beyond these generalized vibrations. Therefore, by varying the defining parameters of these four excitation patterns within the ranges mediated by the tactile receptors, variations in frequency, level, and temporal envelope can be produced. Thus, a stimulus set representative of vibration caused by the majority of haptic interactions may be assembled. Compared to a random sampling of stimuli, this generalization has the advantage of producing a minimal stimulus set requiring a minimal number of ratings and is thus far more efficient. Furthermore, a model trained on a randomly sampled training data set strongly requires cross-validation to demonstrate generalizability. In contrast, a model built from a training data set constructed systematically by including the perceivable range of vibration variation should be more readily generalizable to the perceivable variation of vibration. However, the question of generalizability is shifted to the question of whether the utilized basic excitation patterns are perceptually sufficiently similar to recorded vibration, i.e., equally plausible. This cannot be answered satisfactorily in a formal model validation but only in a perceptual validation study. Therefore, we provide some basic model characteristics (classification error, goodness of fit) and focus instead on the perceptual evaluation for the model validation. The modelling thus builds on the dataset of these four generalized vibrations assessed in [19].
Authorized licensed use limited to the terms of the applicable license agreement with IEEE. Restrictions apply.
To build a representative data set, the authors generalized everyday life vibration according to their temporal and spectral structure into four fundamental excitation patterns enabling systematic variation of their underlying physical parameters:
1) Sinusoidal Stimuli: Periodic mechanical processes can be approximated by sinusoidal acceleration signals characterized by the frequency f and the vibration level L. 21 sinusoidal stimuli were created by varying the frequency (1 Hz, 2 Hz, 5 Hz, 9 Hz, 15 Hz, 26 Hz, 55 Hz, 90 Hz, 155 Hz, 275 Hz, and 500 Hz) over the range of tactile receptors and the level from 10 dB to 36 dB above sensation level (SL) of vertical seat vibration. Due to the approximately linear relationship between vibration level and perceived intensity [27], only two levels were selected to cover the range from perceptual threshold to ISO 2631 exposure limits.
2) Amplitude-Modulated Stimuli: Correlated periodic excitation can be approximated by amplitude-modulated (AM) signals, where A is an acceleration constant determining L, f_m is the modulation frequency, f_c is the carrier frequency, and m is the modulation index. 30 AM sinusoidal stimuli were created with the frequencies (9, 15, 26, 90, 155 Hz) and levels (10 dB (SL) and 36 dB (SL)) of the sinusoidal stimuli. The modulation index was set to one and the modulation frequency was varied over 2 Hz, 5 Hz, 7 Hz, 15 Hz, and 26 Hz. Lower and higher modulation frequencies are increasingly similar to unmodulated vibration from a perceptual standpoint [28].
3) Bandlimited White Gaussian Noise Stimuli: The superimposition of uncorrelated sources can be approximated by white Gaussian narrow-band noise stimuli characterized by level L, center frequency f and bandwidth f_b. The bandwidth f_b was set to 25 Hz, 50 Hz, 100 Hz, 200 Hz, and 400 Hz and the center frequency f_c accordingly to fit in the frequency range from 1 Hz to 400 Hz, resulting in 22 stimuli.
4) Impulse-Like Stimuli: Impacts observable in a mass-spring-damper system excited by a shock can be approximated as a sinusoidal stimulus with an additional decay constant (α) [29]. The resonance frequencies of the 18 impulse stimuli were identical to the range of the sinusoidal stimuli (3 Hz, 5 Hz, 9 Hz, 26 Hz, 90 Hz, and 275 Hz). The exponential decay constants (α) were chosen to include the behavior of a highly damped (α = 8) and a weakly damped (α = 2) resonance system. Because of the short duration of the decaying impulse, the levels of these stimuli were increased to 30 dB (SL) and 42 dB (SL) to account for temporal integration properties of tactile receptors [30] and to remain clearly perceivable.
The stationary stimuli had a duration of 9 s including a fade-in and fade-out of 0.3 s. The impulse-like stimuli had a fade-in of half the oscillatory period of the impulse resonance frequency to enable correct reproduction. The generated vibrations were presented as vertical vibration on a seat.
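The four excitation patterns can be sketched in code. The following is a minimal illustration, not the stimulus generator used for the dataset: the AM and decaying-sinusoid formulas are standard textbook forms assumed here, the noise is approximated by a sum of random-phase sinusoids, and the sample rate `FS`, the reference amplitude `a_threshold` standing in for the perception threshold, the raised-cosine fade shape, and all function names are hypothetical.

```python
import math
import random

FS = 2000  # sample rate in Hz (assumption; above twice the highest frequency used here)

def level_to_amplitude(level_db_sl, a_threshold=1.0):
    """Amplitude for a level in dB above sensation level (SL).

    a_threshold is a placeholder for the frequency-dependent perception threshold.
    """
    return a_threshold * 10 ** (level_db_sl / 20)

def time_axis(duration_s):
    return [n / FS for n in range(int(duration_s * FS))]

def sinusoid(f, level_db_sl, duration_s):
    a = level_to_amplitude(level_db_sl)
    return [a * math.sin(2 * math.pi * f * t) for t in time_axis(duration_s)]

def am_sinusoid(f_c, f_m, level_db_sl, duration_s, m=1.0):
    # Assumed textbook AM form: A * (1 + m*sin(2*pi*f_m*t)) * sin(2*pi*f_c*t)
    a = level_to_amplitude(level_db_sl)
    return [a * (1 + m * math.sin(2 * math.pi * f_m * t)) * math.sin(2 * math.pi * f_c * t)
            for t in time_axis(duration_s)]

def narrowband_noise(f_center, f_b, level_db_sl, duration_s, n_components=40, seed=0):
    # Approximation of bandlimited noise: random-phase sinusoids drawn from the band,
    # scaled so the expected RMS equals that of a sinusoid with the target amplitude.
    rng = random.Random(seed)
    a = level_to_amplitude(level_db_sl) / math.sqrt(n_components)
    comps = [(rng.uniform(f_center - f_b / 2, f_center + f_b / 2),
              rng.uniform(0, 2 * math.pi)) for _ in range(n_components)]
    return [a * sum(math.sin(2 * math.pi * f * t + ph) for f, ph in comps)
            for t in time_axis(duration_s)]

def decaying_impulse(f, alpha, level_db_sl, duration_s):
    # Assumed form of a shock-excited resonance: A * exp(-alpha*t) * sin(2*pi*f*t)
    a = level_to_amplitude(level_db_sl)
    return [a * math.exp(-alpha * t) * math.sin(2 * math.pi * f * t)
            for t in time_axis(duration_s)]

def apply_fade(signal, fade_s=0.3):
    # Raised-cosine fade-in and fade-out (the ramp shape is an assumption).
    n_fade = min(int(fade_s * FS), len(signal) // 2)
    out = list(signal)
    for i in range(n_fade):
        g = 0.5 * (1 - math.cos(math.pi * i / n_fade))
        out[i] *= g
        out[-1 - i] *= g
    return out

# A 9 s, 26 Hz sinusoid at 10 dB (SL) with 0.3 s fades, as one stationary example.
stimulus = apply_fade(sinusoid(f=26, level_db_sl=10, duration_s=9))
```

Keeping each pattern as a function of its defining parameters mirrors the parametric description in the text, so a predicted parameter set could be turned directly into a signal.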
For each of the 91 vibrations, ratings from 29 participants for the 21 most frequently mentioned attributes were obtained. Subsequently, participants rated these 21 attributes on a 100-point scale with the verbal anchors "not at all" (0), "slightly" (25), "moderately" (50), "very" (75) and "extremely" (100) according to [31]. A multivariate analysis in the form of principal component analysis informed the selection of six minimally-correlated tactile attributes ("weak," "up and down," "tingling," "repetitive," "even" and "fading") representing the dimensions of the perceptual space of vibration [19]. Each of the six attributes revealed different relationships to physical stimulus parameters relevant to the subsequent modelling.
• "weak" has a high correlation to the sensation level.
• "up and down" has a highly negative correlation to (carrier, center, and resonance) frequency.
• "tingling" has a highly positive correlation to (carrier, center, and resonance) frequency.
• "repetitive" has a high correlation to modulation frequency.
• "fading" has a high correlation to the decay constant.
• "uniform" has a high correlation to the bandwidth parameter.

B. Model Structure
Since this article intends to demonstrate a proof of concept of the novel authoring approach, its domain is limited to segments with quasi-constant tactile profiles for the purpose of this study. The tactile design language of [19] enables the assessment of standardized expected tactile profiles. In Section I-B tactile plausibility was formalized as being negatively correlated to the distance between an expected tactile profile and a tactile profile elicited by a vibration for a given context. Based on this formalization, the goal of the modelling is to find a vibration whose elicited tactile profile is maximally similar to the expected tactile profile. Therefore, a potential synthesis model should ideally directly predict vibration parameters from the expected profile. Given a sufficiently low modeling error, any vibration eliciting a tactile profile that is similar to the expected tactile profile should be plausible for the given context.
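The formalization can be made concrete with a small sketch. It assumes a Euclidean distance between six-attribute profiles on the 0 to 100 rating scale; the source does not specify the distance metric, and the profile values, vibration names, and function names below are invented for illustration.

```python
import math

ATTRIBUTES = ("weak", "up and down", "tingling", "repetitive", "even", "fading")

def profile_distance(expected, elicited):
    """Euclidean distance between two tactile profiles (attribute -> rating 0..100)."""
    return math.sqrt(sum((expected[a] - elicited[a]) ** 2 for a in ATTRIBUTES))

def most_plausible(expected, candidates):
    """Name of the candidate vibration whose elicited profile is closest to the expectation."""
    return min(candidates, key=lambda name: profile_distance(expected, candidates[name]))

# Invented expected profile for some context, and two invented elicited profiles.
expected_profile = {"weak": 20, "up and down": 10, "tingling": 70,
                    "repetitive": 30, "even": 60, "fading": 5}
candidate_profiles = {
    "vibration_a": {"weak": 25, "up and down": 15, "tingling": 65,
                    "repetitive": 35, "even": 55, "fading": 10},
    "vibration_b": {"weak": 70, "up and down": 60, "tingling": 10,
                    "repetitive": 20, "even": 40, "fading": 50},
}
best = most_plausible(expected_profile, candidate_profiles)
```

Under this reading, a smaller distance corresponds to a higher expected plausibility; a synthesis model replaces the search over candidates by predicting parameters directly.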
The modelling attempts to find a compromise between explanatory modeling and predictive modelling (see [32] for comparison). Thus, instead of assessing a black-box model and minimizing prediction errors, it is also the goal to demonstrate statistically significant explicit relationships between vibration parameters and tactile attribute ratings.
The modeling is structured in three parts. First, due to the generalization of everyday life vibration into four excitation patterns with divergent vibration parameters (see Section II-A), the excitation pattern implied by a given tactile profile needs to be identified by a classifier. Second, the relationships between defining vibration parameters and their most predictive tactile attributes need to be modeled. Thus, regression models were fit to the data of each of the four excitation patterns. Third, the resulting equation systems need to be solved for the vibration parameters to obtain the synthesis equations predicting vibration parameters from attribute ratings. Fig. 2 visualizes a schematic description of the resulting perceptual model for authoring plausible vibration.

C. Support Vector Machine Classifier
To predict the excitation pattern most likely resulting in a highly plausible vibration from a given tactile profile, a classifier is required. Classification algorithms fall into four general categories: probabilistic, linear, prototype-based, and hierarchical [33]. From the observations of Section II-A, it becomes clear that not all attributes are equally relevant to distinguish between excitation patterns, suggesting a decision tree classifier. Given the continuous character of attribute ratings and the likely complex class borders, a support vector machine (SVM) of the linear classifier category seems suitable, but cannot easily account for more than two classes. Therefore, a combination of the decision tree approach and the SVM approach was selected to form a cascaded SVM decision tree.
1) Transient vs. Non-Transient Excitation Patterns: The observations of Section II-A suggested that "fading" might be a suitable tactile attribute to discriminate transient from non-transient (stochastic and periodic, i.e., sinusoidal and AM-sinusoidal) vibration. However, the attribute "fading" alone did not enable a linear separation of these two patterns. Since all the impulse-like excitation pattern stimuli had low "uniform" as well as low "repetitive" ratings, a first SVM classifier was built with these two attributes. MATLAB was utilized to fit the hyperplane separating the two excitation pattern groups. The impulse-like excitation pattern can be classified from the ratings of "uniform" r_u and the ratings of "repetitive" r_r according to the fitted hyperplane. Fig. 3 shows this classifier. This classification has no classification error for the dataset.
2) Stochastic vs. Periodic Excitation Patterns: For the remaining three excitation patterns, a second classifier was constructed. The observations of the ratings of the attribute "uniform" described in Section II-A suggested that it is suitable to distinguish stochastic vibration from periodic, i.e., sinusoidal and AM-sinusoidal vibration. By also integrating the attribute "weak", the bandlimited white Gaussian noise excitation pattern can be distinguished from the two periodic excitation patterns based on the ratings of "uniform" r_u and the ratings of "weak" r_w according to the fitted hyperplane. Fig. 4 shows this classifier. This classification has 7% classification error for the dataset.
3) AM-Sinusoidal vs. Sinusoidal Excitation Pattern: Finally, a third SVM was assessed to classify the remaining sinusoidal excitation pattern vs. the AM-sinusoidal excitation pattern. The rating patterns of "repetitive" observed in Section II-A implied this attribute to be a suitable candidate for this classifier. However, "up and down" needed to be added as a predictor for this classifier, since "repetitive" is also used to describe low frequency vibration. Thus, the sinusoidal vs. AM-sinusoidal excitation pattern can be classified from the ratings of "repetitive" r_r and the ratings of "up and down" r_d according to the fitted hyperplane. Fig. 5 shows this classifier. This classification has 18% classification error for the dataset.
Overall, the three classifiers perform well, with classification errors of 0%, 7%, and 18% for the dataset. If two excitation patterns can potentially generate vibrations with very similar tactile profiles, a classification error would not necessarily be problematic in terms of plausibility. A suboptimal classifier may also be attributed to the perceptual similarity of some vibrations generated from two excitation patterns. The goal of the synthesis model is the synthesis of any vibration that elicits a tactile profile approximating the expected tactile profile. Given the definition of plausibility, any vibration eliciting such a tactile profile is likely to be highly plausible, independent of its excitation pattern. Thus, non-error-free classifiers should not necessarily negatively affect the overall synthesis model performance.
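The cascade described in this section can be sketched as three linear decision functions applied in sequence. The weights, biases, and sign conventions below are placeholders invented for illustration, not the fitted SVM hyperplanes; only the order of the decisions and the attribute pair used at each stage follow the text.

```python
# Each stage is (weights per attribute, bias). All numbers are HYPOTHETICAL
# placeholders, not the hyperplanes fitted to the dataset.
STAGE_TRANSIENT = ({"uniform": 1.0, "repetitive": 1.0}, -60.0)
STAGE_STOCHASTIC = ({"uniform": -1.0, "weak": -0.5}, 70.0)
STAGE_AM = ({"repetitive": 1.0, "up and down": -1.0}, -10.0)

def decision(profile, stage):
    """Signed value of a linear decision function for one stage of the cascade."""
    weights, bias = stage
    return sum(w * profile[attr] for attr, w in weights.items()) + bias

def classify_pattern(profile):
    """Walk the cascade: transient first, then stochastic, then AM vs. sinusoidal."""
    # Stage 1: low "uniform" together with low "repetitive" indicates an impulse.
    if decision(profile, STAGE_TRANSIENT) < 0:
        return "impulse-like"
    # Stage 2: "uniform" and "weak" separate noise from periodic patterns
    # (the direction of the boundary is an assumption).
    if decision(profile, STAGE_STOCHASTIC) > 0:
        return "bandlimited noise"
    # Stage 3: "repetitive" not explained by a high "up and down" rating
    # is attributed to amplitude modulation (assumed direction).
    if decision(profile, STAGE_AM) > 0:
        return "AM-sinusoidal"
    return "sinusoidal"

example = {"uniform": 80, "repetitive": 70, "weak": 20, "up and down": 10}
pattern = classify_pattern(example)
```

Chaining binary decisions in this way sidesteps the two-class limitation of a single SVM, at the cost that an error at an early stage cannot be corrected later.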

D. Regression Models
After the prediction of the most suitable excitation pattern for the vibration synthesis from a given tactile profile, a model needs to predict its vibration parameters accordingly based on the data of [19]. While vibration parameters were accurately set and included no error, the average attribute ratings necessarily contain sampling error for practically feasible sample sizes. Linear least squares regression models assume weak exogeneity of regressors, i.e., that errors should only be present in the dependent variable but not the independent variable [34]. According to [35], there are two solutions for this problem: reversing, i.e., switching dependent and independent variables, or obtaining a model of the inverse relationship and subsequently inverting the resulting equation. The reverse approach would simply violate the assumptions of linear regression and accept the consequences, i.e., attenuation bias in the estimated coefficients [36], which is greater than for the inverse modeling approach [35]. The utilization of data transformations or multiple predictors can aggravate the problem [37]. Thus, following the inverse approach, attribute ratings were first regressed on vibration parameters, and subsequently the resulting equation systems were solved for the vibration parameters.
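The inverse approach can be illustrated with a single predictor. The sketch below regresses a mean rating on a log-transformed frequency by ordinary least squares and then inverts the fitted line to predict a frequency from a desired rating. The data points are synthetic, and the function and variable names are hypothetical.

```python
import math

def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Synthetic example: error-free frequencies (Hz) and noisy mean "tingling" ratings.
frequencies = [5, 15, 55, 155, 500]
mean_ratings = [12, 30, 52, 71, 90]

# Step 1: regress the rating (contains error) on the parameter (set exactly).
intercept, slope = fit_line([math.log10(f) for f in frequencies], mean_ratings)

# Step 2: invert the fitted equation to obtain the synthesis equation.
def frequency_for_rating(rating):
    return 10 ** ((rating - intercept) / slope)

f_target = frequency_for_rating(60)  # frequency expected to elicit a rating of 60
```

The regression keeps the error-prone ratings on the dependent side, and the inversion is a purely algebraic step afterwards.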
The vibration parameter-attribute rating pairs assessed in a repeated measures procedure for 29 participants of [19] cannot be pooled for the regression modeling. The hierarchical clustering of ratings of vibration within participants would violate the assumption of independence of observations. Consequently, tests of the estimated coefficients would have an inflated type I error rate. Therefore, the ratings have to be averaged across the subjects to subsequently conduct the linear regression on mean ratings. Given that the goal of the model is the synthesis of vibration that would on average be perceived as plausible, the inclusion of individual factors in the model is not necessary. Furthermore, the goodness of fit R^2 of the linear regression model offers an easily interpretable measure of model performance on the population level only and is thus directly relevant to the intended use case.
Alternatively, a multilevel model or linear mixed-effects regression model can be extended from the linear regression model by taking into account the hierarchical clustering of ratings in participants by treating the relationship between vibration parameters and attribute ratings as a fixed effect and participants as a random effect. For the case of data with one single observation per treatment level per unit, a by-unit intercept-only random effect is recommended [38]. Since only one rating of each vibration was provided by each participant, a by-subject random intercept was included. The direct inclusion of the observations of all 29 participants in the linear mixed-effects regression potentially results in more accurate estimates, i.e., reduced standard error compared to the linear regression.
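In generic notation (not taken from the source), the two model variants for one attribute rating y and one transformed vibration parameter x can be written as follows, where i indexes participants and j indexes vibrations. The symbols and the normality assumptions are the usual textbook ones and are given here only for orientation.

```latex
% Linear regression on mean ratings (averaged over participants)
\bar{y}_{j} = \beta_0 + \beta_1 x_{j} + \varepsilon_{j}

% Linear mixed-effects model with a by-subject random intercept u_i
y_{ij} = \beta_0 + u_{i} + \beta_1 x_{j} + \varepsilon_{ij},
\qquad u_{i} \sim \mathcal{N}(0, \sigma_u^2), \quad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
```

In both variants the fixed-effect coefficients play the same role in the later inversion; the random intercept only absorbs participant-specific offsets in rating behavior.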
Since the estimates of the linear regression models were almost all significantly different from zero, the linear regression models are reported. The linear mixed-effects model is additionally reported for the case where the estimates are not significantly different from zero. MATLAB was utilized with the fitlm and the fitlme command to fit the linear regression models and linear mixed-effects regression models.
For each excitation pattern, there are two to three defining vibration parameters. To form a well-defined, i.e., neither over- nor underdetermined equation system, the two to three most predictive regressands, i.e., tactile attributes, were selected according to the observations in II.A for building the two to three regression models with the highest adjusted R² for the pattern. The equation systems formed by the two to three resulting regression functions needed to be solved for the vibration parameters to obtain the synthesis equations. However, to find an analytical solution, the regression model complexity (i.e., the predictors included) could not be increased freely to increase the model fit. To avoid having to fall back on approximate numerical solutions that introduce another source of error, the model complexity was only increased (i.e., by adding vibration parameter predictors to the regression model of an attribute) as long as the system remained analytically solvable. If the prediction of attribute ratings from physical parameters had been the only goal of the regression models, more explanatory terms would potentially have been included.
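The inversion step can be illustrated for the simplest case. This is a sketch that assumes every selected attribute model is linear in the (transformed) vibration parameters and contains no interaction terms; the symbols $\mathbf{r}$, $\mathbf{x}$, $\mathbf{b}_0$ and $\mathbf{B}$ are notation introduced here for illustration and are not taken from Table I.

```latex
\begin{align}
\mathbf{r} &= \mathbf{b}_0 + \mathbf{B}\,\mathbf{x}
  && \text{(regression: $n$ attribute ratings from $n$ vibration parameters)} \\
\mathbf{x} &= \mathbf{B}^{-1}\left(\mathbf{r} - \mathbf{b}_0\right)
  && \text{(synthesis: vibration parameters from ratings, if $\det \mathbf{B} \neq 0$)}
\end{align}
```

With as many attribute models as vibration parameters, $\mathbf{B}$ is square, which corresponds to the requirement that the system be neither over- nor underdetermined. Adding nonlinear terms to a regression model makes the system nonlinear in $\mathbf{x}$, so a closed-form solution may no longer exist or may only be valid on a restricted domain.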
Logarithmic transformation is frequently applied to physical parameters to approximate linear relationships to perceptual quantities, e.g., in psychoacoustics for loudness [39]. Similarly, log-transformed acceleration shows an approximately linear relationship to perceived vibration intensity for seat vibration [27]. Moreover, the JNDF [23], [24] is not a frequency-independent constant offset but a constant fraction between reference frequency and frequency increment. Therefore, (carrier-, center- or resonance-) frequency $f$, bandwidth $f_b$, and modulation frequency $f_m$ were also log-transformed. For the case of unmodulated vibration, i.e., $f_m = 0$, an offset of 1 was added. The regression models assessed for the four excitation patterns are reported in Table I and are explained in detail in the following subsections.
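The transformation of the frequency-type parameters can be sketched as follows. Python is used here purely for illustration (the models themselves were fitted in MATLAB), and the function names are hypothetical. The sketch assumes that the offset of 1 is added inside the logarithm for every modulation frequency, so that $f_m = 0$ maps to 0; the text does not state whether the offset was applied only to the unmodulated case.

```python
import math


def log_transform_frequency(f_hz: float) -> float:
    """Return log10 of a strictly positive frequency or bandwidth in Hz."""
    if f_hz <= 0:
        raise ValueError("frequency must be positive")
    return math.log10(f_hz)


def log_transform_modulation(f_m_hz: float, offset: float = 1.0) -> float:
    """Return log10(f_m + offset).

    Assumption: the offset is applied to all modulation frequencies,
    so that an unmodulated vibration (f_m = 0) yields 0 instead of -inf.
    """
    if f_m_hz < 0:
        raise ValueError("modulation frequency must not be negative")
    return math.log10(f_m_hz + offset)
```

Without the offset, an unmodulated vibration could not be represented as a predictor, because the logarithm of zero is undefined.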
1) Models for the Impulse-Like Excitation Pattern: The 18 vibration parameter - tactile profile pairs of the impulse-like excitation pattern were utilized to fit three multiple linear regression models. The models predict the ratings of "fading", "weak", and "tingling" from the vibration level L, resonance frequency f, and decay constant α. All estimates are statistically significantly different from zero except the intercept of the "tingling" model. This means that the rating predicted when all predictors are zero is not significantly different from zero. According to [40], a non-significant intercept can remain in the model for predictions in which not all of the remaining predictors are zero. Since the vibration level is always considered to be above perception threshold, the level range is always far above zero and thus the intercept was included in the model.

TABLE II FIXED EFFECTS IN THE LINEAR MIXED EFFECTS REGRESSION MODEL OF THE ATTRIBUTE "TINGLING" FOR THE AM-SINUSOIDAL EXCITATION PATTERN
2) Models for the Bandlimited White Gaussian Noise Excitation Pattern: The 22 vibration parameter - tactile profile pairs of the bandlimited white Gaussian noise excitation pattern were utilized to fit three multiple linear regression models. The models predict the ratings of "uniform", "weak", and "up and down" from the vibration level L, center frequency f, and bandwidth $f_b$. Again, the intercept for the model of "up and down" was not significant but was still included for the same reasons explained for the impulse-like excitation.
3) Models for the Amplitude-Modulated Sinusoidal Excitation Pattern: The 30 vibration parameter - tactile profile pairs of the AM-sinusoidal excitation pattern were utilized to fit three multiple regression models. The models predict the ratings of "repetitive", "weak", and "tingling" from the vibration level L, carrier frequency f, and modulation frequency $f_m$. Again, the intercept for the model of "repetitive" was not significant but was still included for the same reasons explained for the impulse-like excitation.
However, the predictor frequency in the "tingling" model is also not significant in the linear regression model. Thus, a linear mixed effects regression with the terms of the linear regression as fixed effects and participant as a random effect was additionally calculated (see Table II). The estimate for frequency is the same for the linear regression model and the linear mixed effects regression model, but compared to the linear regression model (see Table I) the standard error of the predictor is greatly reduced and thus the estimate is significantly different from zero. Therefore, the frequency was included as a predictor in this "tingling" model.

4) Models for the Sinusoidal Excitation Pattern: The 21 vibration parameter - tactile profile pairs of the sinusoidal excitation pattern were utilized to fit two multiple regression models. The models predict the ratings of "up and down" and "tingling" from the vibration level L and the frequency f. Again, the intercept for the model of "up and down" was not significant but was still included for the same reasons explained for the impulse-like excitation.

E. Synthesis Equations
The regression models of the attributes reported for each excitation pattern in Table I were combined into one equation system for each excitation pattern. Subsequently, they were solved analytically for the vibration parameters with the MATLAB solve command. The resulting synthesis equations, i.e., the vibration parameters as functions of attribute ratings, are reported for each excitation pattern. A goodness of fit R² on the dataset with 91 vibration parameter - tactile profile pairs of [19] described in Section II-A can be calculated by estimating the vibration parameters from the tactile profiles containing the attribute ratings and comparing the predicted vibration parameter values to the defined vibration parameter values.
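The comparison of predicted and defined vibration parameter values can be sketched as below. Python is used for illustration only, and the function name is hypothetical. The sketch assumes the common definition of the coefficient of determination, R² = 1 - SS_res / SS_tot; the text does not state which definition was used for the reported values.

```python
def r_squared(defined, predicted):
    """Coefficient of determination of predictions against defined values.

    Assumption: R^2 = 1 - SS_res / SS_tot, computed per vibration
    parameter over all vibrations of one excitation pattern.
    """
    if len(defined) != len(predicted) or len(defined) < 2:
        raise ValueError("need two sequences of equal length (>= 2)")
    mean_defined = sum(defined) / len(defined)
    # Residual sum of squares between defined and predicted values.
    ss_res = sum((d - p) ** 2 for d, p in zip(defined, predicted))
    # Total sum of squares of the defined values around their mean.
    ss_tot = sum((d - mean_defined) ** 2 for d in defined)
    if ss_tot == 0:
        raise ValueError("defined values must not all be identical")
    return 1.0 - ss_res / ss_tot
```

Under this definition, a value of 1 means that the synthesis equations recover the defined parameter exactly, while 0 means they perform no better than always predicting the mean of the defined values.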
1) Impulse-Like Excitation Pattern: The vibration parameters of the impulse-like excitation pattern can be predicted from the ratings of the attributes "fading" $r_f$, "weak" $r_w$, and "tingling" $r_t$, according to:

The vibration level L can be predicted with an R² of 0.84, the frequency f with an R² of 0.87, and the decay constant α with an R² of 0.92.
2) Bandlimited White Gaussian Noise Excitation Pattern: The vibration parameters of the bandlimited white Gaussian noise excitation pattern can be predicted from the ratings of the attributes "uniform" $r_u$, "weak" $r_w$, and "up and down" $r_d$, according to:

$$
\begin{aligned}
L &= -0.567\,r_w + 140.0 \\
\log_{10}(f) &= 0.0318\,r_u - 0.0114\,r_w + 1.19 \\
\log_{10}(f_b) &= 0.0449\,r_u + 0.0209\,r_d - 0.00548\,r_w - 0.0109
\end{aligned}
\tag{7}
$$

Since the lower cut-off frequency of the bandlimited noise should not be smaller than 1 Hz to avoid a static acceleration component in the signal, the bandwidth estimates were constrained depending on the center frequency by:

The vibration level can be predicted with an R² of 0.71, the center frequency with an R² of 0.68, and the constrained bandwidth with an R² of 0.5.
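The synthesis equations (7) for the bandlimited noise pattern can be evaluated as in the following sketch. Python is used for illustration only and the function names are hypothetical; the coefficients are those of (7). The bandwidth constraint shown is an assumption made for this sketch: it takes the band to be centered arithmetically on the center frequency, so that the lower cut-off is f - f_b/2, and limits the bandwidth so that this cut-off stays at or above 1 Hz. The authors' own constraint equation may differ.

```python
import math


def synthesize_noise_parameters(r_u: float, r_w: float, r_d: float):
    """Evaluate synthesis equations (7) for the bandlimited noise pattern.

    r_u, r_w, r_d: ratings of "uniform", "weak" and "up and down".
    Returns (level, center frequency in Hz, bandwidth in Hz).
    """
    level = -0.567 * r_w + 140.0
    log_f = 0.0318 * r_u - 0.0114 * r_w + 1.19
    log_fb = 0.0449 * r_u + 0.0209 * r_d - 0.00548 * r_w - 0.0109
    # Frequencies are modeled on a log10 scale, so undo the transform.
    return level, 10.0 ** log_f, 10.0 ** log_fb


def constrain_bandwidth(f_center: float, f_b: float, f_low_min: float = 1.0) -> float:
    """Limit the bandwidth so that the lower cut-off stays >= f_low_min.

    Assumption (not from the source): the band is arithmetically
    centered, i.e., the lower cut-off frequency is f_center - f_b / 2.
    """
    max_bandwidth = max(2.0 * (f_center - f_low_min), 0.0)
    return min(f_b, max_bandwidth)
```

For example, `synthesize_noise_parameters(50.0, 50.0, 50.0)` uses arbitrary illustrative ratings; the resulting bandwidth can then be passed through `constrain_bandwidth` together with the center frequency.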
3) Amplitude-Modulated Excitation Pattern: The vibration parameters of the AM-sinusoidal excitation pattern can be predicted from the ratings of the attributes "repetitive" $r_u$, "weak" $r_w$, and "tingling" $r_t$, according to:

$$
\begin{aligned}
L &= 0.431\,r_t - 0.24\,r_w + 112.0 \\
\log_{10}(f) &= 0.0362\,r_t + 0.022\,r_w - 0.241
\end{aligned}
$$

The vibration level can be predicted with an R² of 0.9, the carrier frequency with an R² of 0.39, and the modulation frequency with an R² of 0.43.

4) Sinusoidal Excitation Pattern: The vibration parameters of the sinusoidal excitation pattern can be predicted from the ratings of the attributes "up and down" $r_d$ and "tingling" $r_t$, according to:

To obtain real parameter estimates, these synthesis equations are only valid under the condition:

Thus, the domain of the synthesis equations of the sinusoidal excitation pattern is constrained to approximately 88% of the perceptual subspace spanned by the attributes "up and down" and "tingling". This prevents the synthesis model from outputting sinusoidal vibration with very high levels at very high frequencies. Since such combinations are rarely encountered in everyday life scenarios, this shortcoming should not negatively affect the plausibility of produced vibrations for the majority of contexts. With these synthesis equations, the vibration level can be predicted with an R² of 0.41 and the frequency with an R² of 0.47.

A. General Model Application
The application of the synthesis model to arbitrary scenes for generating vibrations to be presented in virtual reality can be inferred from Fig. 2. The synthesis model requires as input expected tactile profiles consisting of ratings of the six tactile attributes ("weak," "up and down," "tingling," "repetitive," "even" and "fading"), as well as the duration of the vibration to be generated. For each segment, an expected tactile profile needs to be quantified. For this purpose, the context is communicated to the user who determines the expected profile, e.g., by audio-visual presentation or even verbal description. The tactile attributes of [19] forming the profiles were selected to be easily understandable by laypersons and thus do not necessitate vibration presentation, e.g., for providing a reference vibration.
From the expected tactile profile, the excitation pattern is estimated first. Subsequently, the respective vibration parameters of the excitation pattern can be predicted. Finally, a signal generator synthesizes the vibrations according to the predictions with the equations described in II.A. If the synthesis model classifies an impulse-like excitation pattern for the event, the synthesized impulse-like vibration is placed at the beginning of the event without any additional fade-in or fade-out. However, if another excitation pattern is classified, a short fade-in of 100 ms at the beginning and a short fade-out at the end of the event are applied to the synthesized vibration to avoid artifacts of the reproduction system resulting from the abrupt transition between vibration and no vibration.
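The fade-in and fade-out applied to non-impulsive vibrations can be sketched as follows. Python is used for illustration only and the function names are hypothetical. The sketch assumes a linear gain ramp, a fade-out of the same 100 ms length as the fade-in, and a sampling rate of 1000 Hz; the source specifies only the fade duration, so these choices are placeholders.

```python
import math


def synthesize_sinusoid(freq_hz: float, duration_s: float, sample_rate_hz: int = 1000):
    """Generate a unit-amplitude sinusoid as a list of samples."""
    n_samples = int(round(duration_s * sample_rate_hz))
    return [math.sin(2.0 * math.pi * freq_hz * i / sample_rate_hz)
            for i in range(n_samples)]


def apply_fades(signal, sample_rate_hz: int = 1000, fade_s: float = 0.1):
    """Apply a linear fade-in and fade-out of fade_s seconds each.

    Assumption: the ramp shape is linear; the source gives only the
    fade duration of 100 ms.
    """
    n_fade = int(round(fade_s * sample_rate_hz))
    if 2 * n_fade > len(signal):
        raise ValueError("signal is shorter than the two fades")
    faded = list(signal)
    for i in range(n_fade):
        gain = i / n_fade  # rises from 0 towards 1 over the ramp
        faded[i] *= gain
        faded[len(faded) - 1 - i] *= gain
    return faded
```

A stationary segment would then be produced with, for example, `apply_fades(synthesize_sinusoid(30.0, 5.0))`, while an impulse-like vibration would be left unfaded.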
The model as described in this proof-of-concept paper is applied to scene segments of limited duration with quasi-constant tactile profiles describing one stationary vibration or one impulse-like vibration. For a successive segment with a different expected profile, the model would be applied again to generate another vibration. However, the optimal transition between multiple segments would be another factor potentially influencing plausibility. To focus on the general feasibility of predicting vibration from expected tactile profiles, only single segments were considered in this study.

B. Validation Stimuli
1) Scenes With Recorded Vibration: To validate the synthesis model perceptually, an experiment was conducted to assess the plausibility of the synthesized vibration in the audiovisual contexts of a representative set of recorded vehicle scenes and compare it to the plausibility of the recorded vibrations in these contexts. Everyday life exposure to vibration happens most frequently in cars in the form of vertical seat vibration. Therefore, a representative set of audio-visual-tactile scenes with heterogeneous vibrations was assessed previously by the authors [20]. For a middle class vehicle (Renault Scenic 1.6), the parameters speed, operating condition, and road surface were varied systematically to cover the most commonly encountered situations in everyday life. Twelve quasi-stationary scenes were recorded in which the road surface and the operating condition were kept constant. Seven scenes of impulse-like events, e.g., driving over manholes, were also included. The scenes had a total duration from 4 s to 17 s. The impulse-like events contained in the 7 scenes lasted between 500 ms and 800 ms. Table III lists all scenes.
2) Scenes With Synthesized Vibration: Synthesized vibration was created according to the general model application described in III.A. For the 19 recorded vehicle scenes, average profiles had been obtained by audio-visual-tactile presentation to a group of 31 participants in [20]; these were used as input to the synthesis model (see Table III).
For the stationary scenes (1 to 12), the vibration was generated for the total duration of the audiovisual scene. For the scenes containing impulse-like events such as crossing a manhole (13 to 19), the tactile profiles refer only to the duration of the single impulse-like event, which was manually segmented. For scene 14, an impulse-like vibration was predicted and no fade-in or fade-out was applied, while for the other scenes with impulse-like events another excitation pattern was predicted and thus a fade-in and fade-out of 100 ms was applied in line with III.A. One simple option would have been to cut the audio and video of the scene to the duration of the event. However, this would have resulted in total scene durations below one second and thus would have made it difficult for subjects to grasp the audiovisual context without having to repeat the scene multiple times. Instead, the vibration was only generated for the duration of the event and set to zero for the remaining duration of the audiovisual scene before and after the event. The vibration recorded for the scenes with impulse-like events covered the total duration of the scene. To not bias the results towards scenes containing vibration for the total duration, the recorded vibration was also set to zero outside the duration of the event; it was only faded in 300 ms before the impulse-like event and faded out 300 ms after it for each of these seven scenes.

Fig. 6. Spectra (FFT, 4096 samples, 50% overlapping Hann windows) of the recorded vibration (magenta) and the vibration generated from the parameter estimates of the synthesis model (green) for each of the 19 vehicle scenes according to Table III.

Fig. 7. Time signals of the scenes with the impulse-like events. The recorded vibration (magenta) was faded to zero 300 ms before its beginning and after its end. The synthesized vibration (green) was generated only for the duration of the impulse-like event.

Authorized licensed use limited to the terms of the applicable license agreement with IEEE. Restrictions apply.
The spectra and the time signals of the recorded and the synthesized vibration are compared in Figs. 6 and 7. For some of the scenes, such as number 4, the spectra are surprisingly similar. For other scenes, such as number 8, the spectrum of the synthesized vibration deviates quite drastically from the spectrum of the recorded vibration.

C. Experimental Setup
The audiovisual-tactile stimuli were presented in a virtual reality lab, which can reproduce video, audio and vibration (Fig. 8). To enable the wide frequency range of reproduction in the study [19], a hydraulic hexapod motion platform was utilized for low frequency components and an electrodynamic shaker was utilized for high frequency components. Both systems operated simultaneously to produce vertical vibration to subjects placed on a Recaro Racing seat. Both reproduction systems were driven in their respective linear ranges. To facilitate the generalization of the subsequent results, a calibration of the reproduction system in the experimental condition is necessary [41]. This calibration was run for each subject individually to account for differences in body characteristics, e.g., weight, resulting in an individual transfer function. It was realized with an FIR filter of the inverse transfer function as described in [42].

Fig. 8. Experimental setup for presenting the multimodal (audiovisual-tactile) scenes. Vertical vibrations were presented with a hydraulic platform up to 7 Hz (10 dB (SL)) to 15 Hz (36 dB (SL)). Audio and video were presented via a wave field synthesis audio system and a projector.
Additionally, a projector was used to present the videos of the scene for the multimodal mode. The audio was presented as focussed sound sources at the ears of test subjects with a wave field synthesis system with 464 speakers for the multimodal mode. For the seven impulse-like scenes, a "rate now" subtitle was presented for the duration of the impulse-like event.

D. Experimental Design
In order to avoid biasing the expectations on a scene, participants always rated the plausibility of the vibration within the audiovisual context for all scenes with synthesized vibration before the scenes with recorded vibration. Within the two blocks, scenes were completely randomized.
As argued in Section I, the place illusion and the plausibility illusion both determine whether a user would perceive and interact with the virtual environment as with a real environment. For measuring the place illusion, which is related to presence and immersion enabled by the technical capabilities of the reproduction systems, presence questionnaires have been suggested by [43] and [44]. For the plausibility illusion, which depends on the content of the scene and whether it matches the user expectations, these questionnaires were not sensitive to changes in the coherence of the content of the scene [45]. Similarly, heart rate, skin conductivity and skin temperature were also unsuitable measures [45]. However, [45] and [15] suggest that direct judgements of "plausibility" on rating scales are suitable measures of plausibility. Due to the conceptual similarity of plausibility judgements as a measure of match between elicited and expected perceptual profiles [15], they can be compared to quality judgements [14]. Quality is frequently assessed with direct ratings on Likert scales. Direct judgements of "plausibility" on rating scales have indeed been utilized by [46], [47] and [20] for the haptic modality.
As argued in Section I, for the majority of use cases of virtual reality applications the real environment is not available for direct comparison. Therefore, a direct A:B comparison between the recorded and the synthesized vibration is not necessarily representative of the user experience. Since the user has to compare the tactile properties of the presented vibration to the tactile properties expected from the audiovisual context, his ratings resemble absolute judgements. Therefore, participants rated the "plausibility" on a rating scale. Compared to a discrete 5-tick scale, ratings on a 100-point scale approximate a continuous variable for the statistical evaluation of the resulting data. Equidistant verbal anchors "not at all" (0), "slightly" (25), "moderately" (50), "very" (75) and "extremely" (100) according to [31] were added to the scale to facilitate rating scale utilization by laypersons.

E. Participants
A total of 22 native German speakers (13 male, 9 female) with an average age of 30 years (16 to 61 years) rated the 19 scenes with synthesized vibration and the 19 scenes with recorded vibration. The participants had either no or little experience with the simulator. The study was conducted with the understanding and written consent of each participant. The study was approved by the Ethics Committee of the Technische Universität Dresden (SR-EK-111032020-Amendment) and conducted in line with the guidelines of the Helsinki Declaration.

F. Results
Fig. 9 shows the mean plausibility ratings and 95% confidence intervals of the vibrations synthesized by the regression model in comparison to the ratings of recorded vibrations. The mean plausibility difference between synthesized and recorded vibration over all scenes and attributes is just 4 points on the 100-point scale. To investigate the similarity of the ratings between synthesized and recorded vibration statistically, an ANOVA was conducted in SPSS. Violations of sphericity were accounted for with the Greenhouse-Geisser correction. The factor scene showed a highly significant effect (F(7.729, 162.303) = 11.216, p < 0.001). The factor model (regression model vs. recorded) did not show a significant effect on the plausibility rating (F(1, 21) = 3.832, p = 0.064). The interaction of the factor model and the factor scene showed a significant effect (F(7.851, 164.876) = 3.151, p < 0.05). In combination with the non-significant effect of the factor model, this implies a crossover interaction, i.e., for some scenes the synthesized vibration was more plausible and for other scenes the recorded vibration. To quantify the difference between synthesized and recorded vibration, a paired contrast was calculated in SPSS. This contrast between the two vibration types yields a non-significant (p = 0.064) mean difference of 4 points. Since the 95% confidence interval ranges only from 0 to 9 points, the mean difference is unlikely to exceed one tenth of the 100-point scale.
These results imply that the performance of the excitation pattern classification as well as the goodness of fit of the regression models is sufficient for generating plausible vibration. The stationary scenes 1 to 6 featured broadband vibration in the recorded and in the synthesized vibrations, while scenes 7 to 12 featured narrowband vibration, as evident from Fig. 6. The absence of a significant difference between recorded and synthesized vibration also implies that the greater spectral differences did not lead to a degradation of plausibility. There seems to be a slight trend towards lower plausibility for the recorded and synthesized vibrations of the impulse-like scenes. This is possibly caused by zeroing recorded and synthesized vibration before and after the single impulse-like event, despite audio and video being present over the whole scene duration.
Overall, these results suggest that for practical purposes the synthesis model can generate vibrations that are as plausible as recorded vibration. This confirms the hypothesis of II.A that complex real vibrations are sufficiently perceptually similar to the utilized basic excitation patterns, which are thus a reasonable abstraction for perceptual model-building.

IV. DISCUSSION
Building onto the tactile design language for standardized communication about the sensory perceptual space of vibration of [19] and the assessed dataset with vibration parameter - tactile profile pairs, a novel perceptual model-based authoring approach for plausible vibration was suggested. The approach is based on the formalization of plausibility as a measure of congruence between an expected tactile profile and a tactile profile elicited by vibration. The quantitative assessment of tactile profiles (consisting of ratings of the tactile attributes "weak," "up and down," "tingling," "repetitive," "even" and "fading") expected for a given (audiovisual) context enables a subsequent prediction of vibration eliciting the expected tactile profile.

Fig. 9. Mean plausibility ratings and 95% confidence intervals of vibration synthesized by the regression models (green) vs. recorded vibration (magenta) in the context of their respective audio-visual scenes according to Table III.
For a given tactile profile, the model first predicts, with a set of cascaded SVM classifiers, one of four excitation patterns (impulse-like, bandlimited white Gaussian noise, AM-sinusoidal, sinusoidal) representing the range of everyday life experience of vibration. Subsequently, linear regression models were fitted that predict the characteristic tactile attributes for each excitation pattern from their defining vibration parameters. A set of synthesis equations, i.e., vibration parameters as functions of tactile profiles, was derived from these regression models.
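The cascade structure can be sketched as follows, assuming the stage order suggested by the captions of Figs. 3 to 5 (transient vs. non-transient, then stochastic vs. periodic, then AM-sinusoidal vs. sinusoidal). Python is used for illustration only. The three stage functions are hypothetical stand-ins for the trained SVM decision functions, which are not reproduced here; they are passed in as arguments so that no decision boundary is invented.

```python
from typing import Callable, Mapping

Profile = Mapping[str, float]      # attribute name -> rating
Stage = Callable[[Profile], bool]  # stand-in for one trained binary SVM


def classify_excitation_pattern(profile: Profile,
                                is_transient: Stage,
                                is_stochastic: Stage,
                                is_amplitude_modulated: Stage) -> str:
    """Run a three-stage binary cascade over a tactile profile.

    Each later stage is only consulted if all earlier stages
    returned False for the profile.
    """
    if is_transient(profile):
        return "impulse-like"
    if is_stochastic(profile):
        return "bandlimited white Gaussian noise"
    if is_amplitude_modulated(profile):
        return "AM-sinusoidal"
    return "sinusoidal"
```

In use, each stage would wrap a classifier trained on the two attributes named in the corresponding figure caption, and the returned pattern would select which set of synthesis equations is evaluated next.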
The previously quantified tactile profiles of the 19 recorded audiovisual-tactile scenes [20] were input to the synthesis model to generate vibration. Finally, the synthesized and recorded vibrations were presented to participants in the recorded audiovisual context in an experiment. The results demonstrated that the synthesized vibrations are nearly as plausible as recorded vibration. Together, these findings demonstrate a proof of concept of the novel approach for authoring plausible vibration for virtual environments.
The presented novel approach does not necessitate any vibration recordings of a corresponding real environment. Compared to the authentic authoring approach, it is thus potentially more efficient. Moreover, the approach can also be utilized for scenes with inaccessible real counterparts or even scenes without a real counterpart at all. The extraction of vibration from the optical signal of a scene [8], [9], [10] or from the acoustical signal of a scene [4] is also theoretically applicable to such scenes. However, these methods only work for scenes with optical or acoustical signals featuring a high correlation to the vibration signal [8]. Since they do not consider user expectations, they cannot ensure the synthesis of a vibration that is plausible in the audiovisual context, thus requiring manual optimization of the vibration for optimal experience. This problem may even be encountered for some scenes when simply reproducing vibration recorded in the real counterpart, while manually designed vibration conveying an impression coherent with the audiovisual context would be judged as plausible [11]. By building directly onto user expectations for the scene's context, the presented approach can ensure that tactile expectations on vibrations are met, and thus little to no post-synthesis quality control or optimization should be required.
The disadvantage of the approach is that it necessitates human judgements to assess tactile profiles. However, by utilizing tactile attributes that are understandable to laypersons instead of physical parameters, it does not require experts for the design. Furthermore, the approach offers a systematic strategy for finding optimally plausible vibration, eliminating inefficient trial-and-error strategies that require the presentation of iteratively optimized vibration in user studies. Since standardized vibration reproduction hardware is currently not widely available, crowdsourcing user studies with vibration, e.g., for quality control of the synthesized vibration, is not possible, and such studies must be conducted in lab settings. Recently, model-based image generation from verbal queries has become ubiquitous, highlighting the feasibility of verbal query-based visual authoring. Similarly, the presented model is capable of tactile authoring of vibration from a verbal query consisting of the six tactile attributes (e.g., "very tingling vibration"). Since no vibration reproduction is required for the definition of plausible vibration parameters fulfilling user expectations, the user expectations can be accounted for before the design. Thus, online crowdsourcing platforms can be utilized by providing an audio-visual context or even simply a verbal description of the context. Since layperson participants provided the expected tactile profiles from which the model predicted the plausible vibrations, the approach enables users to author their own vibrations for the future haptic metaverse vision.
Vibration design mediated by databases of specific contexts with associated vibration has been previously suggested [16]. By generalizing contexts according to their tactile attributes, not every context has to be reassessed [20]. Due to the discrete mappings between a tactile profile and its associated vibration, only a finite number of vibrations can be generated. Thus, it is difficult to account for nuances and ensure optimally plausible vibration. One suggested solution is tuning the vibration items in the database to approximate the required tactile profile [18]. However, the relationships for tuning the tactile profiles with associated vibration parameter changes were not generalizable across the vibration items of the database. For the reported synthesis model, it was demonstrated for a representative scene set that a plausible vibration can be directly predicted for a given arbitrary tactile profile without the necessity of tuning.
Even though the presented novel approach seems promising, it is necessary to investigate several research questions regarding its generalizability. The model builds onto four generalized excitation patterns instead of context-specific recorded vibration. These patterns approximate vibrations caused by the physical excitation processes that, according to [22], underlie most haptic interactions: periodic mechanical processes (sinusoidal vibration), correlated periodic excitation (amplitude-modulated vibration), superimposition of uncorrelated sources (variable-bandwidth narrowband noise), and impacts. These excitation patterns span a wide range of temporal and spectral variation of vibration and should cover a wide range of scenarios with similar underlying physical excitation processes. The validation demonstrated that such simple vibrations are sufficiently similar to complex real vibrations from a perceptual standpoint for the investigated driving scenarios. We can assume that the capabilities of the tactile receptors mediating the perception of vibration, and thus explaining the perceptual similarity of these vibrations, are independent of the scenario at hand. However, the sufficiency of this abstraction into excitation patterns should be confirmed in more validation scenarios.
Since this article intends to demonstrate a proof of concept of the novel authoring approach, the investigations were limited to segments with quasi-constant tactile profiles. In many scenarios, vibration may change over time. Such changes would need to be approximated by a sequence of multiple expected tactile profiles, with a different vibration generated by the model for each. To avoid unwanted artifacts, strategies for the transition between excitation patterns should be investigated. However, because of the quasi-continuous predictions enabled by the model, it could be easily extended to account for perceptual profiles changing over time. For gradually changing expected profiles, the regression models can produce gradually changing vibration. Alternatively, the vibrations of two successive segments could simply be crossfaded, e.g., for the transition between vibrations of two different excitation patterns.
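The crossfading option mentioned for successive segments can be sketched as below. This is not part of the validated model; it is a minimal illustration in Python under the assumption of a linear crossfade over a fixed number of overlapping samples, and the function name and overlap length are hypothetical.

```python
def crossfade(segment_a, segment_b, n_overlap: int):
    """Join two sample sequences with a linear crossfade.

    The last n_overlap samples of segment_a are blended with the
    first n_overlap samples of segment_b. Assumption: linear weights;
    other ramp shapes may be perceptually preferable.
    """
    if n_overlap <= 0 or n_overlap > min(len(segment_a), len(segment_b)):
        raise ValueError("overlap must be positive and fit in both segments")
    head = list(segment_a[:len(segment_a) - n_overlap])
    tail = list(segment_b[n_overlap:])
    overlap = []
    for i in range(n_overlap):
        weight_b = (i + 1) / (n_overlap + 1)  # rises towards 1
        sample_a = segment_a[len(segment_a) - n_overlap + i]
        sample_b = segment_b[i]
        overlap.append((1.0 - weight_b) * sample_a + weight_b * sample_b)
    return head + overlap + tail
```

The joined signal is shorter than the two segments combined by the overlap length, which would have to be accounted for when aligning the vibration with the audiovisual scene.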
Generalizability to other directions or even locations of excitation is another topic worthy of further investigation, since the presented model is built on the dataset of ratings of vertical seat vibration from [19]. The investigations of [48] suggest that multiaxial vibration can be transformed to perceptually similar vertical vibration, especially if the excitation is exclusively present in the frequency range dominated by the Pacinian channel. Therefore, other directions of excitation would likely lead to similar perceptual profiles compared to vertical vibration. Furthermore, a comparative investigation between the seat location and hand location of vibration presentation revealed largely similar tactile profiles [49]. Thus, the presented approach seems to be generalizable to other locations of vibration introduction beyond the seat.
The sometimes drastic spectral differences between recorded and synthesized vibrations that still produce nearly equal plausibility ratings might imply a sensory tactile layer of abstraction. Thus, perceptual substitution of vibration signals might be possible, enabling the use of simpler, e.g., narrow-bandwidth, reproduction systems beyond what was demonstrated by [50] or [51]. This layer of abstraction might also guide a perceptual coding strategy that only necessitates the transmission of tactile attribute ratings instead of full vibration signals. Finally, it is possible to apply this approach to the domain of product design. In order to ensure a high quality of the future product, the customer expectations need to be fulfilled. The presented approach would allow assessing tactile customer expectations and deriving vibration parameter specifications even before a prototype is built.

V. CONCLUSION
The haptic metaverse envisions user involvement in creating virtual reality content such as vibrations. We demonstrated the feasibility of a perceptual model for authoring vibration directly from user expectations. To be applicable to a wide range of scenarios, vibrations underlying haptic interactions are generalized into basic excitation patterns. The generative model builds onto a dataset containing these excitation patterns and their associated elicited tactile profiles. From the user-provided expected tactile profiles, the model synthesizes highly plausible vibration. The validation demonstrates equal plausibility compared to recordings for vehicle scenarios. Since the approach builds directly onto user expectations, it does not require recordings and reduces the necessity of post-authoring quality control investigations.

Manuscript received 20 March 2023; revised 3 August 2023; accepted 18 September 2023. Date of publication 25 September 2023; date of current version 20 June 2024. This work was supported in part by German Research Foundation (DFG, Deutsche Forschungsgemeinschaft) as part of Germany's Excellence Strategy-EXC 2050/1-Project ID 390696704-Cluster of Excellence "Centre for Tactile Internet with Human-in-the-Loop" (CeTI) of Technische Universität Dresden, and in part by German Research Foundation (DFG, Deutsche Forschungsgemeinschaft) AL1473/7-1. This paper was recommended for publication by the Guest Editors of the Special Issue on Haptics in the Metaverse. (Corresponding author: Robert Rosenkranz.)

Fig. 2. Input and output of the synthesis model and its potential integration into a simulator for virtual environments.

Fig. 3. SVM classification of vibration into transient and non-transient excitation patterns by the attributes "uniform" and "repetitive".

Fig. 4. SVM classification of vibration into stochastic and periodic excitation patterns by the attributes "uniform" and "weak".

Fig. 5. SVM classification of vibration into AM-sinusoidal and sinusoidal excitation patterns by the attributes "repetitive" and "up and down".

TABLE I LINEAR REGRESSION MODELS

TABLE III RECORDED SCENES OF TYPICAL SITUATIONS IN VEHICLES WHERE A DRIVER IS EXPOSED TO WHOLE-BODY VIBRATION AND THEIR TACTILE PROFILES FROM [19]