Data-Driven Haptic Modeling and Rendering of Viscoelastic Behavior Using Fractional Derivatives

Data-driven modeling and rendering is a general approach in haptics that aims to provide highly accurate haptic perceptual experiences by simulating complex real physical dynamics, such as those of deformable or textured objects. A prevalent problem in present data-driven haptics methods is that the computational cost of modeling grows rapidly, even becoming intractable, as the interaction complexity or the amount of data increases. This paper proposes a data-driven method with greatly improved computational efficiency for modeling viscoelastic deformable objects. This advantage is enabled by the use of fractional derivatives as modeling features and regression forests as data-interpolation models. For the benchmark of normal interaction on deformable objects, we describe a computational framework for data-driven haptic modeling and rendering. Its performance is validated by physical experiments for modeling accuracy and cost and by a perceptual experiment for the similarity between real and virtual objects. The experiments demonstrate that our method offers highly realistic haptic perceptual experiences at a markedly lower modeling cost (by at least ten times) than other state-of-the-art methods.


I. INTRODUCTION
This paper revisits the fundamental research problem of data-driven haptic modeling and rendering, nicknamed ''haptic camera'' [1] or ''haptography'' [2], for viscoelastic deformable objects. As illustrated in Figure 1, this approach captures the complex physical dynamics of a real object when a human finger touches and explores it, expresses the real object's dynamics by a black-box model from the input (finger position and its derivatives) to the output (response force), and then simulates the black-box model to present identical haptic experiences to a user in a virtual environment.
The associate editor coordinating the review of this manuscript and approving it for publication was Yangmin Li.
The black-box models for input-output interpolation are built empirically by applying appropriate machine learning techniques to the captured real interaction data.
The data-driven approach in haptics has advantages for objects exhibiting complex physical phenomena, e.g., deformable or textured objects. Such physical properties are difficult to represent accurately using the alternative of physics-based modeling and simulation, especially because of the strict real-time computation requirement of haptic rendering. The rule of thumb is that update rates of no less than 1 kHz are necessary to provide stable haptic interaction with a stiff virtual object using a force-feedback interface. In fact, the faster the update rate is, the better the stability of haptic interaction becomes [3]. Besides, the physics-based methods often require an exhaustive identification or tuning procedure of their model parameters for faithful simulation.

FIGURE 1. While a user haptically interacts with a real deformable object, the user's interaction motion and the object's response force are recorded, as if taking ''haptic photographs''. The recorded data is processed and used to build a black-box computational model, e.g., a random forest model, between the motion (input) and the force (output). This data-driven model is used to recreate the interaction dynamics of the real object with a haptic interface in a virtual environment.
A critical issue present in data-driven haptics for deformable objects is that the amount of real interaction data and its processing time, i.e., the modeling cost, easily become intractable when the interaction patterns are complex. Examples include an object pressed by multiple fingers and a case exhibiting significant inhomogeneous behaviors depending on the contact position; see Section II. In this paper, we present a new data-driven approach based on fractional derivatives (FDs) to improve the modeling efficiency.
Given a mathematical function, a fractional derivative is its derivative of arbitrary order (not necessarily integer); see Appendix A. FDs have been used to describe complex mechanical phenomena, where viscoelasticity is one of the main targets (Section II). In our framework, while pressing and releasing a real viscoelastic object using an instrumented probe, we measure the probe's position normal to the object surface, which represents the extent of deformation, and the response force generated by the object (Section III). Then, we compose an input feature vector for machine learning models using FDs of the normal position (Section IV). Each FD has an order between 0 and 1, the two extremes corresponding to position and velocity, respectively. This is one of the crucial differences from previous data-driven methods, which used either the position-velocity pair or the current and several past position samples as input features (Section II). As interpolation models to predict the response force, we test both regression forests (RFs) and radial basis functions (RBFs). The computational performance of our new approach is evaluated experimentally in comparison with other state-of-the-art methods. We employ the trained models for rendering deformable objects in a virtual environment (Section V). Finally, the perceptual accuracy of our data-driven framework is assessed by a human-subject experiment (Section VI).
Conclusions are summarized in Section VII with a plan for future work.
Contributions of our work consist of: (1) the first application of fractional derivatives to haptic data-driven modeling and rendering of deformable objects; (2) the proposal of a regression forest-based interpolation model leading to substantially reduced modeling cost; and (3) the physical and perceptual validations of the modeling and rendering performance of the developed methods.

II. RELATED WORK
The recent advances in machine learning have allowed data-driven modeling to be actively used for describing complex phenomena in many fields of science and engineering, such as physical modeling, material modeling, weather forecasting, and financial forecasting [4]. In physics, researchers have developed physics-informed machine learning models that are trained with additional information derived from physical laws [5]. For example, Hatfield et al. [6] have highlighted the importance of data-driven methods for high-energy-density physics. In [7], the authors used deep learning frameworks to solve nonlinear partial differential equations. Cenedese et al. [8] have developed a data-driven modeling and prediction method for nonlinear dynamic systems. Researchers have also started employing data-driven methods in materials science [9]. Forecasting or prediction is also one of the promising topics in data-driven modeling [10]. For example, data-driven methods have been employed for predicting the electricity consumption of a building [11] and for wildfire forecasting [12].
Likewise, data-driven haptics is concerned with model-free methods that employ nonparametric machine learning techniques to learn input-output mapping functions. The aim is to model the complex physical behaviors of real objects and simulate them in real time for rendering in a virtual environment. The data-driven approach has been successfully employed for deformable objects [13], [14], [15], [16], [17], surface textures [18], [19], [20], [21], [22], and thermal interaction [23]. The review in this section focuses on data-driven haptics for deformable objects.
Höver et al. [13] were among the first to apply the data-driven approach to capturing the viscoelastic behavior of a deformable object. They used the position and velocity of a haptic tool as the input features, as in physics-based models, and employed an RBF-based interpolation method to predict the output response force. This work was extended to modeling slip and inhomogeneous interactions on deformable objects [14], [15]. Yim et al. [16] presented a complete data-driven framework for inhomogeneous objects handling both viscoelastic and frictional responses in a unified way. For friction identification, they estimate sliding yield surfaces and anisotropic friction cones that depend on the contact position from the measured data. The frictional behavior during rendering is simulated using a virtual proxy, and its position is updated using the sliding yield surfaces while crossing the stick and slip regions. The position-dependent deformation of the inhomogeneous object is modeled using an RBF formulation that includes the virtual proxy position as an additional input variable. This notable improvement, however, significantly increases the amount of data required and the modeling complexity, often making RBF training intractable for complex inhomogeneous real objects.
This problem of high computational cost for modeling has been addressed by a few groups. Sianov and Harders [17] proposed a feature-based learning scheme for RBF models to reduce the dimensionality of datasets. They select features using greedy kernel selection, which effectively reduces the modeling time for large datasets. This method was successfully applied to viscoelastic objects explored by two fingers.
As an alternative to RBF, we previously proposed using regression forests, a well-known nonparametric machine learning technique, to model viscoelastic deformable objects [24]. In this method, the input features consist of the current and past position samples. The RF-based model outperformed RBF-based models in terms of prediction accuracy, amount of training data, and computational time. However, when tested for rendering virtual objects, we found that the RF method did not provide stable interactions. We observed that the RF model often simulated a negative damping behavior, which essentially accumulates the energy during haptic interaction and can make the rendering unstable [3], [25]. This experience motivated us to explore better input features for RF-based models.
Recently, Abdulali et al. [26] proposed data-driven methods for hyper-viscoelastic materials. In this work, deformation data are captured using external cameras. The relationship between applied force and deformation is represented by a highly efficient FEM (finite element method) model. Then, a real-time predictor is built from the FEM model and used for haptic rendering. In their next paper [27], the authors extended the framework to model plasticity in deformation using inverse reinforcement learning.
In this paper, we use fractional derivatives to define input features. In mechanics, FDs have long been used as an alternative means of describing viscoelastic behaviors [28], [29]. The key component is the springpot (Figure 2), which can be combined with other traditional mechanical elements to model the complex dynamics of many real-world objects [30]. FDs have been applied to many other research problems to explain the viscoelastic properties of deformable objects [31], [32], elastomers [33], [34], and tissues [35], [36], [37]. FDs have recently been introduced to haptics. Caldiran et al. [38] studied the perceptual properties of the springpot model when rendered with different amplitudes and phases in the frequency domain.
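As a concrete anchor for the discussion, the springpot's constitutive law can be stated in one line. This is the standard form from the fractional viscoelasticity literature; the notation is ours, not taken from [30]:

```latex
% Springpot constitutive law: force proportional to a fractional
% derivative of the deformation, with order 0 <= alpha <= 1.
F(t) = c_\alpha \, D^{\alpha} x(t), \qquad 0 \le \alpha \le 1.
% alpha = 0 recovers an ideal spring (F = c_0 x),
% alpha = 1 an ideal damper (F = c_1 \dot{x});
% intermediate orders interpolate between the two behaviors.
```

The two extreme orders reduce the springpot to an ideal spring and an ideal damper, which is why FD orders between 0 and 1 naturally span elastic and viscous responses.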

III. DATA COLLECTION
A. EXPERIMENTAL SETUP
In this paper, we focus on modeling the viscoelastic deformation behavior of a real soft object when it is pressed and released in the normal direction. The aim is to provide highly realistic haptic responses by simulating the model and rendering the results using a force-feedback interface. Our experimental setup is designed accordingly. As shown in Figure 3, the experimental setup for data collection consists of a force-feedback device (Omega 3.0, Force Dimension), a load cell (DBCM-2kg, Bongshin), and a deformable object. The software is implemented using an open-source haptic rendering library, CHAI3D. The load cell is attached to the end-effector of the force-feedback device. The other end of the load cell has a tip of 6 mm diameter for interaction with the soft object. The tip is always located at the center of the object's top surface for data collection. The deformation depth is measured by the haptic device, and the force by the load cell. The data (position and force) are updated at a sampling rate of 1 kHz.

For haptic rendering, the load cell is replaced with an aluminum handle that has a very similar size and weight to the load cell. The upper image in Figure 4 compares the sensor and the handle. The gravity compensation constant included in CHAI3D for force rendering is adjusted according to the changed mass of the end effector.
We consider five different silicone materials for data collection. The materials are Ecoflex 00-10, Ecoflex 00-30, Ecoflex 00-50, Dragon Skin FX, and Dragon Skin 10, from the softest to the hardest, all from Smooth-On, Inc. Their detailed information is available in Table 1. From each silicone material, a cube with an edge length of 3.5 cm is molded and used in this work; see the lower image in Figure 4.

B. TRAINING DATA
Our data collection procedure is automated using the force-feedback device. An inverted cosine signal

f(t) = (a/2)(1 − cos(2πνt))   (1)

is commanded as a force control signal to the device's tip, where a is the target (peak-to-peak) amplitude in N and ν is the indentation frequency in Hz. To characterize the object's rate-dependent deformation property for users' general exploratory behavior, we choose three amplitudes, a ∈ {2, 6, 10} (in N), and nine frequencies, ν ∈ {0.2, 0.25, 0.33, 0.5, 1, 2, 3, 4, 5} (in Hz). For each pair (a, ν), we collect the interaction data for a single indentation cycle. The output force is regulated by proportional-derivative (PD) control to ensure stable data collection. We collect interaction data for each object in Figure 4 using the 27 input force profiles (3 amplitudes × 9 frequencies). The total length of the training dataset is 48.85 s. An example of the collected data is provided in Figure 5. These datasets are used for training interpolation models (Section IV).
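The 27 training profiles can be reproduced numerically. The sketch below assumes the inverted-cosine form f(t) = (a/2)(1 − cos 2πνt) over a single cycle; this form is inferred from the text rather than taken from the paper's code, and we read the 0.33 Hz entry as 1/3 Hz:

```python
import numpy as np

def indentation_profile(a, nu, fs=1000.0):
    """One indentation cycle of an inverted-cosine force command.

    a  : peak-to-peak amplitude in N
    nu : indentation frequency in Hz
    fs : sampling rate in Hz (1 kHz in the paper)

    Assumed form: f(t) = (a / 2) * (1 - cos(2 * pi * nu * t)).
    """
    t = np.arange(0.0, 1.0 / nu, 1.0 / fs)
    return 0.5 * a * (1.0 - np.cos(2.0 * np.pi * nu * t))

amplitudes = [2.0, 6.0, 10.0]                        # N, peak-to-peak
frequencies = [0.2, 0.25, 1/3, 0.5, 1, 2, 3, 4, 5]   # Hz ("0.33" read as 1/3)
profiles = [indentation_profile(a, nu)
            for a in amplitudes for nu in frequencies]
```

With these values, the 27 single-cycle profiles total 3 × (5 + 4 + 3 + 2 + 1 + 0.5 + 1/3 + 0.25 + 0.2) = 48.85 s, which matches the reported length of the training dataset.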

C. VALIDATION DATA
We collect another dataset for each object to validate the trained interpolation models. The validation set must include sufficiently complex and general conditions that can be interpolated from the range of the training data. We generate a complex force signal to indent the object by computing

f_i(t) = (a_i/2)(1 − cos(2πν_i t))   (2)

for t ∈ [0 s, 15 s]. a_i and ν_i are randomly chosen between 0.2 N and 1 N and between 0.2 Hz and 5 Hz, respectively, for i ∈ {1, 2, ..., 5}. The values of f_i(t) after the last zero-crossing before 15 s are zero-padded to make any sum of multiple f_i(t) terms end with zero. Then, we compute

f_s(t) = Σ_{i=1}^{5} f_i(t),   (3)
f(t) = 10 f_s(t) / max_t f_s(t).   (4)

The five element signals are added in (3). The result is normalized in (4) by the maximum force amplitude (10 N) used for training data collection. We make ten validation input force signals using (2)-(4) and then collect interaction data for 15 s for each input signal. We apply a zero-phase moving average filter (window size 25) to smooth both the measured position and force data. See Figure 7 for examples.
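A hypothetical reconstruction of the validation-signal generator, under the same inverted-cosine assumption as for the training profiles; the component form, the zero-padding rule, and the random seed are our choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)           # seed chosen for illustration
fs, T = 1000.0, 15.0
t = np.arange(0.0, T, 1.0 / fs)

def component(a_i, nu_i, t, T=15.0):
    """One element signal f_i(t), zero-padded after its last zero-crossing."""
    f = 0.5 * a_i * (1.0 - np.cos(2.0 * np.pi * nu_i * t))
    t_last = np.floor(T * nu_i) / nu_i   # end of the last full cycle before T
    f[t >= t_last] = 0.0
    return f

a_vals = rng.uniform(0.2, 1.0, size=5)   # amplitudes in N
nu_vals = rng.uniform(0.2, 5.0, size=5)  # frequencies in Hz
f_sum = sum(component(a_i, nu_i, t) for a_i, nu_i in zip(a_vals, nu_vals))
f_val = 10.0 * f_sum / f_sum.max()       # normalize to the 10 N training max
```

Because each inverted-cosine component is non-negative and returns to zero at multiples of its period, zero-padding after the last full cycle guarantees that the summed signal ends with zero force.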

IV. MODELING
Using the collected interaction data, we learn a nonparametric mapping from the input (position) to the output (force). In this paper, we propose new input features based on fractional derivatives, which are free from the rendering instability problem of RF models mentioned in Section II.

A. INPUT FEATURES
The input features are derived from the fractional derivatives of the deformation position. Here, we use the sampled sequences of position x[n] and force f[n] obtained from the continuous position x(t) and response force f(t), where n is the discrete time index. We make an input feature vector as follows:

X[n] = [D^{r_1} x[n], D^{r_2} x[n], ..., D^{r_10} x[n]]^T,   (5)

where D^r x[n] denotes the r-th-order fractional derivative of x[n], and each order r_j lies between 0 and 1. The training dataset of each deformable object includes the 27 position-force signal pairs measured for the 27 input force profiles of different amplitudes and frequencies (Section III). We represent them by x_k[n] and f_k[n], respectively (k = 1, 2, ..., 27). Then, we compute X_k[n] using (5) for all n ∈ {1, 2, ..., N_k}, where N_k is the number of recorded samples in x_k[n]. For each n, X_k[n] (input feature) is paired with f_k[n] (output force). For compact notation, we stack X_k[n] and f_k[n] for all n, such that

X_k = [X_k[1], X_k[2], ..., X_k[N_k]] and f_k = [f_k[1], f_k[2], ..., f_k[N_k]].

Finally, X_k and f_k are stacked over k, i.e., over all the 27 training signal sets, such that

X* = [X_1, X_2, ..., X_27] and f* = [f_1, f_2, ..., f_27],

where X* ∈ R^(10×N_k×27) and f* ∈ R^(1×N_k×27). Consequently, (X*, f*) constitutes the full training dataset.
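The fractional derivatives themselves can be computed with a short-memory Grünwald–Letnikov approximation, which matches the fixed-length filter described later for rendering (171 samples at 1 kHz). This is a sketch of one standard discretization, not necessarily the exact filter construction the paper uses:

```python
import numpy as np

def gl_weights(r, L, dt=1e-3):
    """Grünwald–Letnikov filter weights for the r-th order derivative.

    Short-memory approximation over an L-sample window with sampling
    time dt: D^r x[n] ~= sum_j w[j] * x[n - j].
    """
    w = np.empty(L)
    w[0] = 1.0
    for j in range(1, L):                   # recursion for (-1)^j * C(r, j)
        w[j] = w[j - 1] * (1.0 - (r + 1.0) / j)
    return w / dt**r

orders = np.arange(0.05, 0.501, 0.05)       # the ten FD orders in (5)
filters = np.stack([gl_weights(r, 171) for r in orders])

def fd_features(x_window):
    """Feature vector X[n] from the latest 171 positions, newest first:
    x_window[j] = x[n - j]."""
    return filters @ x_window
```

For r = 1 the filter degenerates to the backward difference (x[n] − x[n−1])/dt and for r = 0 to the identity, so fractional orders between 0 and 1 indeed interpolate between position and velocity, as described above.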

B. MODEL TRAINING
For each object, we estimate an interpolation function that predicts the response force f[n] from the FD feature vector X[n]. To this end, we learn a regression forest model on the training dataset prepared for the object. An RF consists of many decision trees and can be used for regression [39]. As illustrated in Figure 6, an input vector is fed to many decision trees in parallel, and their output values are averaged to obtain the final output; refer to [39] for more on RFs.
For RF model training, we choose the following parameters: the ten FD orders in (5) set to {0.05, 0.10, ..., 0.50}, the number of decision trees set to 100, and the stopping criterion of a minimum of five samples at leaf nodes. We use FDs of many different orders to fully utilize the strength of RFs in automatically searching for meaningful features in a large feature space. Each of the ten orders explicitly represents a different time-dependent behavior (see Figure 19 in Appendix A) so that RF training can choose the best rate-related features. For implementation, we use the source code available in [40]. The resulting interpolation models are denoted by RF-FD models.
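The paper trains its RFs with the code from [40]; an equivalent configuration in scikit-learn would look like the following, with synthetic stand-in data since the measured signals are not available here:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in for the FD feature matrix X* (rows = X[n]) and the
# force vector f*; in the paper these come from the measured signals.
rng = np.random.default_rng(1)
X_train = rng.normal(size=(1000, 10))       # 10 FD features per sample
y_train = X_train @ rng.normal(size=10)     # placeholder force targets

rf = RandomForestRegressor(
    n_estimators=100,       # 100 decision trees
    min_samples_leaf=5,     # stopping criterion: >= 5 samples per leaf
    random_state=0,
)
rf.fit(X_train, y_train)
importances = rf.feature_importances_       # cf. the analysis in Figure 11
```

The `feature_importances_` attribute provides the per-order variable importance that the paper later uses to compare hard and soft objects.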
For comparison, we also train a radial basis function model on input features consisting of two FDs:

X[n] = [D^{r_a} x[n], D^{r_b} x[n]]^T.

To choose the best two orders for each object, we train RBF models with all pairs of orders chosen from the ten orders used for the RF-FD models. Then, we select the pair that results in the lowest root mean squared error (RMSE) on the validation datasets. The chosen FD orders are shown in Table 2, and the resulting models are called RBF-FD models. Note that including more FD terms in the input features significantly increases the computation time for RBF model training. The computational results of such cases are not compared in this paper. Lastly, we train another RBF model on the conventional input features of position and velocity:

X[n] = [x[n], v[n]]^T.

The velocity v[n] is estimated using the first-order adaptive windowing method [41]. These models are named RBF-PV, as an implementation of the previous state-of-the-art model [13] for comparison. All the models are trained on a reduced dataset randomly sampled from the original training dataset, as in [13]. The size of the training dataset critically affects the time required for model training. The ratio of 20% of the training data to use was determined by repeated empirical tests; using a larger ratio tended to make training of the RBF models infeasible.
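For reference, a comparable RBF-PV baseline can be sketched with SciPy's RBFInterpolator using a cubic kernel. The 100 kernel points are emulated here by fitting on a random subsample, and the data are synthetic placeholders:

```python
import numpy as np
from scipy.interpolate import RBFInterpolator

rng = np.random.default_rng(2)
X = rng.uniform(size=(5000, 2))      # columns: position x[n], velocity v[n]
y = 3.0 * X[:, 0] + 0.5 * X[:, 1]    # placeholder linear force response

centers = rng.choice(len(X), size=100, replace=False)  # 100 kernel points
rbf = RBFInterpolator(X[centers], y[centers], kernel='cubic')
f_pred = rbf(X[:10])                 # predict forces for new samples
```

The cubic kernel interpolates the subsampled training points exactly, which illustrates why RBF training cost grows quickly with the number of kernel points and input dimensions: each added FD term enlarges the dense linear system that must be solved.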
The two RBF models, RBF-FD and RBF-PV, use the following settings for training: the number of kernel points = 100 and the cubic spline interpolation method. These numbers, as well as the number of trees for RF models, were chosen considering the update rate of haptic rendering of the trained models. As the model size increases, the computation time of rendering becomes longer, which degrades the stability of haptic rendering.

C. MODELING PERFORMANCE
Examples of the force curves predicted by the RF-FD, RBF-FD, and RBF-PV models for all the objects are presented in Figure 7 with the measured force curves. They are chosen from the modeling results of the ten validation input profiles (Section III-C) to represent the average performance. The lower panels in Figure 7 show the errors between the measured and predicted force curves. Generally, the output force profiles closely match the measured force responses for all of the three data-driven models. The absolute prediction errors are less than 0.4 N for all the objects, except Ecoflex 00-10 (the softest material). For Ecoflex 00-10, the absolute errors sometimes increase to around 1 N.
We consider the absolute percentage error (APE), defined as the difference between the measured and predicted forces divided by the measured force. For perceptual similarity, the APE should lie below the just noticeable difference (JND) of human force perception. Figure 8 presents the APE vs. the measured force for each object's trained models. Each plot also shows the JND curve of force perception for comparison. The JND curve is initially very large and then rapidly decreases as the reference force increases, finally plateauing at the constant Weber fraction (10%) [42]. The APEs for all the models lie below the JND curve for reference forces larger than 2 N for all the objects, which is generally acceptable [26], [43]. Among the three models, the two FD methods show lower APEs than RBF-PV.
For each object, we also compute the RMSEs averaged over the ten validation datasets. The results are shown in Figure 9. RF-FD and RBF-FD, trained on the fractional derivatives, result in slightly lower RMSEs than RBF-PV.

Last, we report the modeling computation time, covering both feature computation and model training. The measurement was performed on a regular PC (Windows 10, i7-11700k CPU, 32 GB memory) using Matlab. Results are shown in Figure 10. The computation times were similar regardless of the real object for all three models. Their means were 11.9 s, 126.7 s, and 126.7 s for RF-FD, RBF-FD, and RBF-PV, respectively. Therefore, the RF-FD method requires more than ten times less computation than the two RBF methods. This advantage will be amplified when we extend our modeling approach to inhomogeneous objects as in [16], since the modeling time grows in proportion to the number of surface contact points.
The above results indicate that RF-FD provides accuracy similar to the two RBF methods at a greatly reduced modeling cost. Note that the prediction accuracy and computation time can be further improved if the parameters are optimized per object. For example, Figure 11 shows the variable importance of the ten FD features obtained from the RF training for the five real objects. Variable importance is a relative indicator of how much information the corresponding feature provides to a random forest. The plots show that fractional derivatives of low orders are dominant for hard objects (low viscosity), whereas those of medium orders (0.3-0.5) are more important for soft objects (high viscosity). Including higher-order terms close to 1.0 is not necessary for the objects tested, although it may be for highly viscous objects. We may further improve the prediction and/or computation performance by using only the features of high importance for each object.

V. HAPTIC RENDERING
For haptic rendering, we recreate the deformation dynamics of a real object by simulating its data-driven model and then generating the response force. To this end, the end-effector of the haptic device is replaced by the handle shown in Figure 4.
Our haptic rendering program is implemented under Microsoft Windows using C++ and CHAI3D, similar to the data collection program. Since we train RF and RBF models using MATLAB, the resulting models are loaded into the haptic rendering program. A loader file for RF models includes all information required to construct and simulate a regression forest. Given an input feature vector, all decision trees in the RF are traversed, and their results at the leaf nodes are averaged to determine the final output force. A loader file for RBF models delivers information about the centers of radial basis functions and their weights. Using these values, our program computes the rendering force for an input feature vector of RBF models.
The overall procedure of haptic rendering is depicted in Figure 12. The position of the haptic tool is denoted by x. As shown in Figure 12a, the past position samples covering 171 ms are queued in a vector. This vector is convolved with a fractional derivative filter of the same length to compute fractional derivative features for the RF-FD and RBF-FD models. The FD filter can be precomputed because our models use fixed fractional orders (0.05, 0.1, ..., 0.5), a fixed time difference (1 ms), and a limited length (171); see Appendix A. The FD features are input to a data-driven model (RF-FD or RBF-FD) to compute the virtual model force f_M. For RBF-PV computation, shown in Figure 12b, we use the tool velocity v provided by the haptic device's driver. Its output is smoothed using an exponential moving average (EMA) filter with a weighting factor of α = 0.7 for the current term. This position-velocity pair is the input to an RBF-PV model, which calculates the output force f_M.
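The EMA smoothing used for the velocity (and later for the coupling force) is the standard first-order filter; a minimal sketch with the paper's α = 0.7:

```python
def ema(samples, alpha=0.7):
    """Exponential moving average, y[n] = alpha * s[n] + (1 - alpha) * y[n-1],
    with weight alpha on the current term (0.7 in the paper)."""
    out, prev = [], None
    for s in samples:
        prev = s if prev is None else alpha * s + (1.0 - alpha) * prev
        out.append(prev)
    return out
```

In the rendering loop this would be applied sample by sample rather than over a stored list, but the recurrence is the same.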
The data-driven model output f_M is used as the input to a virtual coupling algorithm to improve rendering stability, as depicted in Figure 12c. Its computational steps are specified in Algorithm 1. In the pseudocode, x_proxy, v_proxy, and a_proxy denote the virtual proxy position, velocity, and acceleration, respectively. x_proxy is initialized with NULL before the initial contact. f_proxy is the force applied to the virtual proxy to update its movement, and f_VC is the force output of the virtual coupling. Δt is the sampling time. The algorithm simulates the simple dynamics of a virtual spring-damper system connecting the device position and the virtual proxy. The virtual proxy is treated as a quasi-static point mass, and its movement is simulated accordingly. The algorithm returns the coupling force between the two points. We use the following manually tuned parameters: spring constant k = 1.5 N/mm, damping coefficient b = 20 Ns/mm, and mass m = 50 g. The output force f_VC of the virtual coupling is smoothed further by an EMA filter with α = 0.7. This final output force f is sent to the force-feedback device for haptic rendering. The effects of virtual coupling and EMA filtering are illustrated in Figure 13 using an example. The above procedure is executed repeatedly in a dedicated thread. Table 3 shows the three data-driven models' update rates measured using a regular PC (Windows 11, i7-11700k CPU, 32 GB memory). The update rates are similar regardless of the real object used for modeling. RBF-PV shows the fastest update rate (mean 1982.6 Hz), followed by RBF-FD (mean 990.3 Hz); this difference is due to the use of FD features. RF-FD shows the lowest update rate (mean 788.3 Hz), indicating that rendering force computation is approximately 20% slower for RF models using 10 FD features than for RBF models using 2 FD features. Nonetheless, all of these update rates are sufficient for the haptic rendering of relatively soft deformable objects.
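One virtual-coupling update in the spirit of Algorithm 1 can be sketched as follows, converted to SI units from the paper's k = 1.5 N/mm, b = 20 Ns/mm, and m = 50 g. The sign conventions and the quasi-static integration scheme are our assumptions; the paper's Algorithm 1 may differ in detail:

```python
K = 1500.0     # N/m   (= 1.5 N/mm, spring constant)
B = 20000.0    # N*s/m (= 20 Ns/mm, damping coefficient)
M = 0.05       # kg    (= 50 g, proxy mass)
DT = 0.001     # s, sampling time

def coupling_step(x, v, f_M, state):
    """One virtual-coupling update.

    x, v  : device position (m) and velocity (m/s)
    f_M   : data-driven model force (N)
    state : (x_proxy, v_proxy); x_proxy is None before the first contact
    returns (f_VC, new_state)
    """
    x_p, v_p = state
    if x_p is None:                          # initialize proxy on first contact
        x_p, v_p = x, 0.0
    f_VC = K * (x - x_p) + B * (v - v_p)     # spring-damper coupling force
    f_proxy = f_VC - f_M                     # net force on the quasi-static proxy
    v_p += (f_proxy / M) * DT                # integrate proxy motion
    x_p += v_p * DT
    return f_VC, (x_p, v_p)
```

The returned f_VC would then be smoothed by the EMA filter (α = 0.7) before being sent to the device, as described above.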
When computing FD features, we use the FD coefficients precomputed assuming a 1-ms sampling time. The average update times for RF-FD and RBF-FD are 1.27 ms and 1.01 ms (Table 3). Thus, we need to examine whether the force output errors caused by the sampling time difference in RF-FD are significant. RF-FD has a sampling time of 1.27 ms, and the derivative orders that have high variable importance are lower than 0.4 (Figure 11). For these conditions, the errors in the value of the fractional derivative are negligible, according to Figure 20 in Appendix A. Thus, the slightly longer sampling time of RF-FD is expected not to cause practically noticeable problems. For confirmation, we resample the actual rendering results of position and force to have the exact 1-ms sampling time by linear interpolation and compare them with the original rendering results. An example is shown in Figure 14; similar results were observed for the other objects. Therefore, the RF-FD model is quite robust to variability in the update rate of haptic rendering, at least in the range of real materials tested in this work.

VI. PERCEPTUAL EXPERIMENT
We performed a user study to validate the perceptual performance of our data-driven haptic modeling and rendering framework for viscoelastic deformable objects. Participants' task was to compare the similarity between real and virtual objects and express it as a number. This similarity rating paradigm has been frequently used in studies related to data-driven haptics [16], [18], [19], [21], [26], [44].

1) PARTICIPANTS
We recruited 18 adults (5 females and 13 males, average age 24.2 years) for this experiment. None of them had a history of neurophysiological disorders. They were provided with written and verbal instructions about the experiment. They were paid KRW 15,000 (≈USD 12) as compensation.

2) TASK
Participants sat comfortably in front of a computer monitor. They controlled the force-feedback device (Omega.3) with their right hands and used their left hands to enter responses using a keyboard; see Figure 15. A physical wall was placed between the haptic device and the monitor screen to block any visual cues regarding the device movements and the real/virtual objects. Figure 16 illustrates the experimental setup. When a real object was placed on the left side, the corresponding virtual object was rendered on the right side, and vice versa. Participants explored both objects while pressing and releasing them in the vertical direction using the haptic device without a time limit. To guide the haptic exploration, the monitor screen displayed the haptic interface point (HIP), a visual guideline, and a contact line. The HIP represents the position of the device tool tip in the virtual environment. The contact line was aligned with the top faces of the real and virtual  objects. The visual guideline indicated the maximum permissible penetration depth (7.5 to 11 mm) used for training the models. An invisible virtual wall was also rendered to define the workspace limit while helping participants interact around the centers of the real/virtual object's top faces.
We provided two specific instructions to participants to ensure that the rendering took place within the range covered by the data-driven models. First, they were asked not to penetrate the objects beyond the visual guideline; otherwise, a warning message was given. This precaution was to keep the rendering force within the force range of the training data. Second, they were asked not to move too fast while touching the objects. An error message was shown when their movement velocity exceeded four times the maximum velocity used to generate the training data. Additionally, participants wore noise-canceling headphones that played white noise to prevent any possible effect of external sound on the experimental results.

3) CONDITIONS
We tested 21 experimental conditions: the 5 deformable objects (Ecoflex 00-10, Ecoflex 00-30, Ecoflex 00-50, Dragon Skin FX, and Dragon Skin 10) × the 3 trained models (RF-FD, RBF-FD, and RBF-PV) + 5 upper bound conditions (one per material) + 1 lower bound condition. In the upper bound conditions, participants compared two identical real objects; these conditions were included to obtain maximum similarity scores. In the lower bound condition, participants compared the two most different real objects: Ecoflex 00-10 (the softest) and Dragon Skin 10 (the hardest). This condition provided a perceptual anchor for the most dissimilar case.
The haptic rendering functions for RF-FD and RBF-FD were executed as fast as possible at the update rates shown in Table 3. For RBF-PV, the update rate was fixed at 1000 Hz.

4) PROCEDURE
The experiment consisted of three sessions of identical design. In one session, each experimental condition was tested once, except the lower bound condition, which was repeated four times. Thus, one session included 24 trials. The order of the experimental conditions was randomized per session and participant. Each participant completed all three sessions in a within-subject design.
In each trial, participants compared two deformable (real or virtual) objects. The experimenter manually switched the real objects under the tip of the haptic device. The haptic device generated no force while a real object was explored. The positions (left or right) of the real and virtual objects were randomly selected per trial. Participants compared the haptic responses of the two objects sufficiently without a time limit. Then, they rated the perceptual similarity between the two objects by entering a number between 0 and 100 using a keyboard.
Before the main sessions, participants completed a training session that followed the same procedure as the main sessions. Participants familiarized themselves with the haptic device and the experimental task. The experimenter provided feedback on how participants executed the task whenever necessary.
At the end of the experiment, participants had a debriefing session and answered a questionnaire consisting of the following five questions:
Q1: Was there any unstable or abnormal behavior of the haptic device during the experiment?
Q2: What criteria did you use for rating similarity?
Q3: Could you distinguish between real and virtual objects?
Q4: If your answer to Q3 is yes, what were your criteria?
Q5: Which factors/aspects could improve the rendering?
In addition, we recorded the trajectory of the HIP in all trials and sessions.

1) OUTLIER REMOVAL
Some participants showed inconsistent responses in the experiment. For objective assessment, we computed the standard deviations (SDs) of the similarity scores collected in the same experimental conditions for each participant. Two participants, P4 and P13, showed very high inconsistency: their SDs averaged across the 21 experimental conditions exceeded 30, while the mean SDs of the other participants were between 3.87 and 24.33. We also checked for possible inconsistency due to stimulus position by averaging the similarity scores collected when the trained model was placed on the left or the right. Two participants, P6 and P13, failed this test: their left-right difference scores were over 30, while the average of the other participants was 5.47. As a result, we removed the experimental data of the three participants P4, P6, and P13 and proceeded with the data of the remaining 15 participants.

2) HYSTERESIS CURVES
Figure 17 shows the displacement-force hysteresis curves combining all the HIP trajectories and command forces collected from the interaction data of one participant, for each virtual object and data-driven model. A hysteresis curve of a typical viscoelastic object exhibits different behaviors when the object is pushed (loading) and released (unloading). In Figure 17, the upper and lower parts of each curve represent the loading and unloading behaviors, respectively, forming an elliptical curve. Harder objects, e.g., Dragon Skin 10, show a smaller hysteresis curve with higher stiffness, while softer objects, e.g., Ecoflex 00-10, show a larger hysteresis curve with lower stiffness. These behaviors are consistent with those reported in the literature for real and virtual viscoelastic objects [13], [43], [45]. The other participants produced similar displacement-force curves.

3) SIMILARITY SCORES
Figure 18 presents the collected similarity scores SS in box plots for all 21 experimental conditions.
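The SD-based consistency screening used in the outlier-removal step can be sketched as follows. The function name and data layout are our own; the cutoff of 30 follows the paper:

```python
from statistics import mean, stdev

def flag_inconsistent(scores_by_condition, sd_cutoff=30.0):
    """Flag a participant whose similarity scores are inconsistent.

    scores_by_condition maps each experimental condition to that
    participant's repeated scores across the three sessions. The
    participant is flagged when the sample SD, averaged over
    conditions, exceeds the cutoff (30 in the paper).
    """
    mean_sd = mean(stdev(scores) for scores in scores_by_condition.values())
    return mean_sd > sd_cutoff
```

A consistent participant (e.g., scores 80, 82, 81 for one condition) stays well under the cutoff, while scores such as 10, 90, 50 produce an SD of 40 and trigger removal.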
In the upper-bound conditions (comparison between the same real objects), the mean SS ranged from 81.6 to 88.5, with a grand mean of 84.2. In the lower-bound condition (comparing the softest and hardest real objects), the mean SS was very low at 8.7. The mean SS values of the three data-driven models were substantially greater than the lower bound, and they were comparable to the upper bounds. For the five materials, the RF-FD models resulted in the mean SS values between 73.8 and 85.5, RBF-FD between 72.0 and 81.7, and RBF-PV between 77.9 and 82.7.
To further examine the significant interaction term, we conducted simple effect tests on the SS for each Material. Method had a significant effect for Ecoflex 00-10 (F(3, 42) = 7.30, p < 0.001), Ecoflex 00-50 (F(3, 42) = 3.40, p = 0.026), and Dragon Skin FX (F(3, 42) = 2.87, p = 0.047). We ran Tukey's post-hoc tests for the significant cases. For Ecoflex 00-10, a significant difference in SS existed between the two FD models and the upper-bound condition. For Ecoflex 00-50, RF-FD and the upper-bound condition showed a significant difference. For Dragon Skin FX, no significant differences were found among the four conditions of Method.
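As a simplified, self-contained illustration of an F test of this kind, a between-groups one-way F statistic can be computed as below. This is only a sketch: the paper's simple effect tests were repeated-measures analyses, which partition variance differently:

```python
def one_way_anova_f(groups):
    """Between-groups one-way ANOVA F statistic (pure Python).

    groups is a list of lists of scores, one list per condition.
    F = (SS_between / df_between) / (SS_within / df_within).
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
                     for g in groups)
    ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g)
                    for g in groups)
    df_between, df_within = k - 1, n - k
    return (ss_between / df_between) / (ss_within / df_within)
```

A large F indicates that between-condition variability dominates within-condition variability, which is then followed up with post-hoc comparisons such as Tukey's test.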

4) QUESTIONNAIRE
The participants' responses to the five questions are summarized as follows.
Q1 Eleven out of the 15 participants sometimes felt (unstable) vibrations while exploring the virtual objects. Q2 All of the participants regarded the response force as the main criteria for similarity rating. Three participants also considered the change in response force when they varied the haptic exploration velocity. Q3 Eight participants could distinguish virtual objects from real objects. Q4 The major sensory cue for detecting virtual objects was oscillations that the participants sometimes experienced at deep penetration. Q5 Ten participants provided suggestions to improve the haptic rendering. Six of them were about removing unrealistic oscillations at deep penetration.

C. DISCUSSION
The similarity scores were bounded below by 8.7 (between the most different real objects) and above by 84.2 (between the same real objects). This range resulted from the participants' general tendency to avoid extreme responses. All of the data-driven models led to high similarity scores, comparable to or slightly smaller than the upper bound. The mean similarity scores of RF-FD, RBF-FD, and RBF-PV across all the objects were 78.1, 77.0, and 80.4, respectively. No significant differences were found between any pair of the three data-driven models for any of the five materials (Figure 18). Therefore, all of the data-driven models and rendering methods are highly and similarly effective in replicating the deformation dynamics of a real object. A related study [46] obtained similar perceptual similarity scores for the same materials using an FEM-based modeling approach. The similarity scores of our RF-FD model were 75.4 and 77.3 for Ecoflex 00-30 and Ecoflex 00-50, respectively (Section VI-B), while the FEM-based approach in [46] yielded 82.7 and 84.9 for the same two materials. The scores of both studies are very high, although the absolute scores cannot be compared in a statistically meaningful way due to differences in the detailed methods and experimental setups. Therefore, our work demonstrates that viscoelasticity can be modeled from a smaller amount of data in a much shorter time than with the other models in the literature, while preserving high accuracy.
The unrealistic oscillation problem mentioned by many participants generally occurred when the user touched a virtual object at a very high velocity or penetrated very deeply into the object. In our data-driven models, the dynamics for such high velocities or deformations are not defined, and the rendering is likely to exhibit unexpected behaviors. In fact, this problem is common to all data-driven methods. One remedy is extrapolation with a well-defined physics-based model when the input variables exceed the data collection ranges.
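The suggested remedy can be sketched as a fallback that, outside the training range, continues the data-driven force with a simple physics-based extension. The linear spring continuation and all names below are our own illustration, not the paper's implementation:

```python
def render_force(x, v, model, x_max, v_max, k_ext):
    """Render a force, falling back to a linear continuation outside
    the training range.

    model(x, v) is the data-driven force; x_max and v_max bound the
    training data; k_ext is a hypothetical extrapolation stiffness.
    Inside the range the data-driven model is used directly; outside,
    inputs are clamped and a spring term keeps the force well defined
    and continuous at the boundary.
    """
    if x <= x_max and abs(v) <= v_max:
        return model(x, v)
    x_c = min(x, x_max)                 # clamp into the data range
    v_c = max(-v_max, min(v, v_max))
    return model(x_c, v_c) + k_ext * (x - x_c)  # linear extension
```

Because the extension starts from the model's value at the boundary, the rendered force stays continuous when the user crosses out of the data range, avoiding the sudden transients that can excite oscillations.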

VII. CONCLUSION
In this paper, we have presented a new data-driven approach for haptic modeling and rendering of viscoelastic deformable objects. Our approach is to train a regression forest model along with features consisting of fractional derivatives of the haptic device position. This new method can reduce the computational cost of modeling to a great extent while maintaining modeling accuracy and rendering quality. This advantage is demonstrated by physical and perceptual experimental data in reference to the other state-of-the-art methods that use radial basis functions to interpolate between the position and velocity data.
For future work, we plan to extend our methods to model inhomogeneous deformable objects and multi-finger exploration. Modeling of frictional behaviors on the objects may also be considered. We also plan to adopt more up-to-date machine learning techniques, such as physics-informed deep neural networks.

APPENDIX A COMPUTATION OF FRACTIONAL DERIVATIVES
A fractional derivative, first proposed in 1695 by L'Hospital, is a derivative of arbitrary order, real or complex. As with integer-order derivatives, the $r$th-order fractional derivative of a function $f(x)$ is written as $D^r f(x)$, where $D^r$ represents the operator for the $r$th-order fractional differentiation.
For efficient computation, we use the Grünwald-Letnikov fractional derivative [47], defined as
$$D^r f(x) = \lim_{h \to 0} \frac{1}{h^r} \sum_{k=0}^{\infty} (-1)^k \binom{r}{k} f(x - kh),$$
where $\binom{r}{k}$ is the generalized binomial coefficient for real order $r$.
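A direct numerical scheme truncates the Grünwald-Letnikov sum after a finite number of samples, generating the signed binomial coefficients $w_k = (-1)^k \binom{r}{k}$ with the recurrence $w_k = w_{k-1}\,(1 - (r+1)/k)$, which is valid for real $r$. The following sketch (function name and defaults are ours) illustrates the idea:

```python
def gl_fractional_derivative(f, x, r, h=1e-3, n_terms=200):
    """Truncated Grunwald-Letnikov fractional derivative of order r at x.

    Approximates D^r f(x) ~ h^(-r) * sum_k w_k * f(x - k*h), where
    w_k = (-1)^k * binom(r, k) is built via the recurrence
    w_k = w_{k-1} * (1 - (r + 1) / k), valid for real r.
    """
    w = 1.0            # w_0 = 1
    total = f(x)       # k = 0 term
    for k in range(1, n_terms):
        w *= 1.0 - (r + 1.0) / k
        total += w * f(x - k * h)
    return total / h ** r
```

For integer orders the recurrence terminates naturally (all $w_k$ with $k > r$ vanish), so the scheme reduces to the familiar backward finite difference; for example, with $r = 1$ it recovers $(f(x) - f(x-h))/h$.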