Modeling Fabric-Type Actuator Using Point Clouds by Deep Learning

Flexible actuators are popular in the consumer and medical fields because of their flexibility and compliance. However, they are typically difficult to model because of their viscoelasticity and nonlinearity. This letter proposes a deep learning method that corrects the simulated deformation of flexible robots, represented as point clouds, so that it more closely matches the deformation of real robots. Long short-term memory (LSTM) can predict the next frame of actuator deformation from the previous frames. In this study, we present robots with four different muscle structures. We found that an encoder–LSTM–decoder network improves the similarity between the simulated and real deformations of a learned muscle structure and is also effective in correcting the deformation of unlearned structures. Our correction method reduced the average Chamfer distance of the simulated point clouds of the basic-type actuator from 15.89 to 7.81. This research can provide a new concept for future flexible robot modeling using point clouds.

[...] and controlling soft actuators because the material's structural compliance, nonlinear action, and viscoelasticity result in complex and unpredictable behaviors [13]. Over the past several decades, deep learning has made unprecedented progress. The insights that artificial intelligence technology can extract from complex data benefit medicine [14], [15], [16], autonomous driving [17], [18], [19], [20], and many other fields. Deep learning is also well known for effectively solving nonlinear problems in soft robotics [21]. The finite element method (FEM) and position-based dynamics (PBD), the current mainstream methods, have some limitations in modeling, such as the trade-off between computing power consumption and model accuracy; nonetheless, deep learning can compensate for these shortcomings to some extent [22], [23], [24], [25], [26], [27].

The associate editor coordinating the review of this manuscript and approving it for publication was Tao Wang.

This paper presents a method of modeling four soft fabric-type actuators by correcting the simulated point clouds through deep learning so that they approach the point clouds obtained from real actuators using four depth sensors. This paper also aims to address the above limitations for fabric-type actuators by deep learning. PointNet [28] is a robust framework that can transform the three-dimensional (3D) coordinates of point clouds into local or global features. It has recently achieved excellent performance in point cloud classification [29], [30], [31] and semantic segmentation [32], [33], [34], [35] tasks. In this research, the encoder of PointNet was used to extract the global features of the point clouds, and then the simulated point cloud features of the previous [...]

[...] reduce the computational cost of every deformation simulation, we used a PBD-based simulator.
PBD-based simulators can simulate simple fabric deformations caused by wind or by collisions with other objects, but their accuracy is insufficient for complex nonlinear deformations [41]. The fabric-type actuator, which is the subject of this study, has multiple artificial muscles placed on a fabric. As the artificial muscles contract, the fabric deforms to the expected shape. However, the contact between the fabric and the artificial muscles creates a nonlinear mechanical effect when the actuator deforms. These nonlinear mechanical effects degraded the simulation accuracy, and there was a large difference between the simulation results and the actual deformation of the actuator. Therefore, after using the PBD-based simulator to simulate the deformation of the actuator, we used a deep learning-based correction system to correct the nonlinear mechanical actions that the PBD-based simulator could not simulate. In deep learning, LSTM has many advantages over feedforward networks for nonlinear [...] wind-bridge interaction system by LSTM to accurately and efficiently predict bridges' flutter and post-flutter behavior, with nonlinear unsteady aerodynamics induced by bridge motion. In this study, LSTM plays the primary role in the correction because it can process time-series data and is suitable for connecting the high-dimensional features (global features) of simulated and actual point clouds. With an appropriate correction method, the real deformation of the fabric-type actuator can be simulated.
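For readers less familiar with PBD, its core step is the projection of particle positions onto constraints. The following is a minimal sketch of the textbook distance-constraint projection between two particles. It is not the simulator used in this study; the function name `project_distance_constraint` and the inverse-mass weights `w1` and `w2` are illustrative assumptions.

```python
import math

def project_distance_constraint(p1, p2, rest_length, w1=1.0, w2=1.0):
    """Move two particles so that their separation equals rest_length.

    w1 and w2 are inverse masses (hypothetical parameters for this sketch);
    a particle with weight 0 is pinned in place.
    """
    delta = [a - b for a, b in zip(p1, p2)]
    dist = math.sqrt(sum(d * d for d in delta))
    if dist == 0.0 or w1 + w2 == 0.0:
        return list(p1), list(p2)  # degenerate case: leave positions unchanged
    # Constraint violation, shared between the particles by inverse mass.
    scale = (dist - rest_length) / (dist * (w1 + w2))
    new_p1 = [a - w1 * scale * d for a, d in zip(p1, delta)]
    new_p2 = [b + w2 * scale * d for b, d in zip(p2, delta)]
    return new_p1, new_p2

# Two particles stretched to 2.0 apart, with a rest length of 1.0.
a, b = project_distance_constraint([0.0, 0.0, 0.0], [2.0, 0.0, 0.0], 1.0)
```

A cloth solver repeats projections of this kind over all neighboring particle pairs for a few iterations per frame. This keeps the cost low, but it treats effects such as fabric-muscle contact only approximately.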

The main contribution of this research is the combination of deep learning with traditional simulation instead of relying solely on deep learning modeling. The correction system requires less computing power than conventional FEM modeling and is more controllable than deep learning-based black-box models because the muscle distribution and the parameters of the actuator are set in Unity. This research can provide rapid simulation close to real actuators for future robot design and an alternative solution for future real-time simulation.

There are three basic elements of the fabric-type actuator: the fabric body, the artificial muscle, and the fixed point. The fabric body promotes 3D deformation of the actuator by changing the force direction of the artificial muscle. The artificial muscle contracts in its axial direction when air pressure is applied to it. The fixed point transmits the force of the artificial muscle to the fabric body by limiting their relative movement.

The fabric-type actuator was controlled by contracting and stretching a plurality of artificial muscles. Thus, a control system is required to independently control the con- [...]

[...] described, but the higher the calculation cost. The size of the fabric model (Fig. 4(a)) was set to 1.00 m in length and width and 0.01 m in thickness, considering the ease of processing objects in Unity.

In addition, the artificial muscle was created by connecting multiple tiny rigid body models to form a constrained chain. Each tiny rigid body model is a cuboid with a length of 0.01 m, a width of 0.01 m, and a height of 0.05 m. The tiny rigid body models were connected by the Configurable Joint option in Unity to realize the contraction of the artificial muscle. The local coordinate axes of the tiny rigid body model with the Configurable Joint are depicted in Fig. 4(b). When no air pressure is applied to the artificial muscle, the movement of the rigid body model in the local y-axis direction is restricted to zero. When air pressure is applied, the movement of the rigid body model in the local y-axis direction is restricted to a distance that matches the contraction rate of the actual artificial muscle, as described in [52] and [53]. The tiny rigid body model can be moved [...]

FIGURE 3. The robot control system comprises an air compressor that supplies air pressure to power the artificial muscles and a regulation unit that includes a filter regulator to maintain air pressure. In addition, it comprises an electro-pneumatic regulator that controls the air pressure supplied to individual artificial muscles, thereby controlling the deformation of the fabric actuator.

[...] one sensor, the number of points collected from the actual actuator differed in each deformation, and we measured the entire fabric-type actuator without omission by setting four sensors at multiple angles. By converting the point cloud obtained from each sensor into the world coordinate system based on the arrangement of each sensor and the iterative closest point (ICP) algorithm [54], the deformation of the actuator can be acquired as a complete point cloud consisting of point clouds from different cameras.
In the ICP algorithm, we find the translation T and rotation R that minimize the sum of squared errors in Equation (2) for the two corresponding point sets P and X in Equation (1), where x_i and p_i denote the corresponding points and N_p represents the number of points. This computation was completed using the MATLAB® Computer Vision Toolbox. After the ICP registration, we downsampled the registered point clouds to approximately 3,800 points through a box grid filter. To facilitate later comparison between the actual and the simulated point clouds, we randomly sampled the downsampled point clouds so that they contained the same number of points as the simulated point clouds (3,362 points). The actuator and the four depth sensors were fixed on a metal frame to prevent changes in their relative positions.
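For reference, the ICP objective can be written in its commonly used form with the symbols defined above. This is the standard formulation and is offered here as an assumption about Equations (1) and (2); in particular, the normalization by N_p may differ from the expression used in this study.

```latex
\begin{align}
P &= \{p_1, \dots, p_{N_p}\}, \qquad X = \{x_1, \dots, x_{N_p}\} \\
E(R, T) &= \frac{1}{N_p} \sum_{i=1}^{N_p} \left\| x_i - \left( R\,p_i + T \right) \right\|^2
\end{align}
```

Each ICP iteration re-estimates the correspondences between P and X and then solves for the R and T that minimize E, repeating until the error stops decreasing.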

The number of particles placed on one side of the simulated fabric-type actuator model was 41 × 41. Therefore, the total number of particles on the surface of the actuator model representing the point cloud was 41 × 41 × 2 = 3,362. The created fabric model with a point cloud is depicted in Fig. 6(b).

[...] Following PointNet [28], Autoencoder1 was trained on the simulated point clouds (Autoencoder1, Fig. 8), and its encoder was used as the encoder of the correction system. Furthermore, Autoencoder2 was trained on the real point clouds (Autoencoder2, Fig. 8), and its decoder was used as the decoder of the correction system.

[...] 3) was used to evaluate the loss function, and training was stopped when the loss function stopped improving on the validation set. Early stopping had two main parameters: min delta and patience. An absolute change in the loss function of less than min delta is considered no improvement; min delta was set to 0.1 in this study. Patience is the number of epochs with no improvement after which training is stopped. Patience was set to 10 when training Autoencoder1 and Autoencoder2, whereas it was set to 30 when training the LSTM because the training curve of the LSTM drops more slowly.

After the correction system had been trained, we evaluated whether the corrected simulated point cloud can accurately describe the actual point cloud. The Chamfer distance (CD) [55] was used to evaluate the extent to which the simulated point cloud fits the actual point cloud. In addition, the loss function of our neural network was also based on the CD, so the neural network was trained toward smaller values of the CD. In this study, the CD was defined using Equation (3) and is an average index of the similarity between two groups of point clouds.
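Because the CD drives both training and evaluation, a minimal sketch may help fix ideas. It assumes the common symmetric form: the mean nearest-neighbor Euclidean distance from each cloud to the other, summed over both directions. Whether Equation (3) uses squared distances or a different normalization is not assumed to be known, and the function name `chamfer_distance` and the toy point sets are illustrative only.

```python
import math

def chamfer_distance(cloud_a, cloud_b):
    """Symmetric average Chamfer distance between two lists of 3D points.

    Assumption: unsquared Euclidean distances, averaged per direction and
    summed over the two directions.
    """
    def mean_nearest(src, dst):
        # Average distance from each point in src to its nearest neighbor in dst.
        return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)

    return mean_nearest(cloud_a, cloud_b) + mean_nearest(cloud_b, cloud_a)

# Two toy "clouds": the second is the first shifted by 1.0 along z.
simulated = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
measured = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
cd = chamfer_distance(simulated, measured)
```

This brute-force version compares every pair of points. For clouds of 3,362 points, a spatial index or a vectorized implementation would normally be preferred.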

The measurement frame rate was set to 60 Hz, and a time-series point cloud of 90 frames was acquired from the start of the application of the pneumatic pattern. The four types of artificial muscle actuator configurations used for measurement are shown in Fig. 9, and Fig. 10 depicts an example of the basic-type robot measurement results. The deformation of the actuator was described by point clouds measured using four depth sensors.
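Each measured deformation is therefore a 90-frame sequence (1.5 s at 60 Hz), and the network in this study predicts the next real frame from five consecutive simulated frames. One way to slice such sequences into training pairs is sketched below. The stride of one frame and the helper name `make_windows` are assumptions, not details taken from the study.

```python
def make_windows(sim_frames, real_frames, window=5):
    """Pair each run of `window` simulated frames with the real frame that follows.

    Assumptions: both sequences are time-aligned and equally long, and
    consecutive windows are shifted by one frame.
    """
    if len(sim_frames) != len(real_frames):
        raise ValueError("sequences must have the same number of frames")
    pairs = []
    for start in range(len(sim_frames) - window):
        inputs = sim_frames[start:start + window]
        target = real_frames[start + window]
        pairs.append((inputs, target))
    return pairs

# Frame labels stand in for per-frame point clouds or feature vectors.
sim = [f"sim{i}" for i in range(90)]
real = [f"real{i}" for i in range(90)]
pairs = make_windows(sim, real, window=5)
```

Under these assumptions, one 90-frame deformation yields 85 input-target pairs.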

The side type was obtained by rotating the basic type 90° counterclockwise. There were subtle differences because the basic and side types have different camera angles, even with the same control input. Generally, tilting the actuator at a certain angle can be used as a type of data augmentation in deep learning because convolutional neural [...]

[...] shuffled. The number of deformations in the cross and vertical types was significantly smaller than that of the basic type because there were fewer variations in deformation with artificial muscles installed on only one side. Moreover, the vertical-type actuator was used only for testing, whereas the cross-type actuator was used only for training. The number of deformations and frames obtained from the simulation was the same as the number of real deformations under each configuration. To match the number of particles in the simulated fabric model, the point cloud in each frame was processed into 3,362 points.
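The reduction of each measured frame to a fixed number of points can be sketched in two steps: a box grid filter that merges the points in each cubic cell into their centroid, followed by random sampling without replacement. This study performed these steps in MATLAB. The Python below is only an illustrative equivalent, and the names `box_grid_filter`, `sample_fixed_count`, `cell_size`, and `seed` are assumptions.

```python
import random
from collections import defaultdict

def box_grid_filter(points, cell_size):
    """Replace all points that fall in the same cubic cell by their centroid."""
    cells = defaultdict(list)
    for x, y, z in points:
        key = (int(x // cell_size), int(y // cell_size), int(z // cell_size))
        cells[key].append((x, y, z))
    return [tuple(sum(c) / len(pts) for c in zip(*pts)) for pts in cells.values()]

def sample_fixed_count(points, count, seed=0):
    """Randomly keep exactly `count` points, without replacement."""
    if count > len(points):
        raise ValueError("not enough points to sample from")
    return random.Random(seed).sample(points, count)

# Toy cloud: four points, two of which share a cell of size 1.0.
cloud = [(0.1, 0.1, 0.1), (0.3, 0.1, 0.1), (1.5, 0.2, 0.2), (2.5, 0.2, 0.2)]
reduced = box_grid_filter(cloud, cell_size=1.0)
fixed = sample_fixed_count(reduced, count=2)
```

In the setting described here, the cell size would be chosen so that about 3,800 points remain, and `count` would be 3,362.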

In this study, TensorFlow 2.7 was used to construct the neural network. In the first step, we used the simulated point cloud as both the training data and the label data in Autoencoder1; that is, the input and target data are the same. The global feature of the simulated point cloud can then be obtained from the encoder of Autoencoder1. In the second step, we likewise used the real point cloud as both the training and label data in Autoencoder2, so that the global features of the real point cloud can be obtained from the encoder of Autoencoder2. A learning rate of 0.001 with the Adam optimizer was used to train Autoencoder1 and Autoencoder2. In addition, a dropout layer with a dropout rate of 0.05 was used in the hidden layer of the decoder of Autoencoder1 and Autoencoder2 to improve generalization. In the third step, LSTM was employed to predict the global features of real point clouds from the global features of simulated point clouds. The training data were the global features of the simulated point cloud obtained by Autoencoder1, and the label data were the global features of the real point cloud obtained by Autoencoder2, as previously discussed. A learning rate of 0.00001 with the Adam optimizer was used to train the LSTM. The LSTM was trained using the global features produced by the above autoencoders. In this study, m = 3,362, n = 1,024, and l = 5. Finally, the encoder of Autoencoder1, the LSTM, and the decoder of Autoencoder2 were connected sequentially, forming an encoder-LSTM-decoder structure. As in the training of the autoencoders, the loss function uses the Euclidean distance, and training stops when the validation loss no longer decreases. After training, these three components operate as a single encoder-LSTM-decoder pipeline.
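The stopping rule used during training (a min delta of 0.1 and a patience of 10 or 30 epochs, as stated earlier) can be illustrated without any framework. The sketch below follows the usual early-stopping logic of comparing each validation loss with the best value seen so far. It is an assumption that this matches the exact behavior of the callback used in this study, and `stopping_epoch` is a hypothetical helper name.

```python
def stopping_epoch(val_losses, min_delta=0.1, patience=10):
    """Return the 1-based epoch at which training would stop, or None if it never does."""
    best = float("inf")
    waited = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if best - loss > min_delta:
            best = loss   # counted as an improvement
            waited = 0
        else:
            waited += 1   # change smaller than min_delta, or the loss went up
            if waited >= patience:
                return epoch
    return None

# The loss plateaus after epoch 2; a short patience is used to keep the example small.
losses = [10.0, 8.0, 7.95, 7.93, 7.92, 7.91]
stop = stopping_epoch(losses, min_delta=0.1, patience=3)
```

With a larger patience, as used for the LSTM, training tolerates a longer plateau before stopping. This suits a loss curve that decreases slowly.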

The simulated point cloud data of five consecutive frames are transformed into global features by the encoder. The LSTM then uses these global features to predict the global features of the next frame of the real point cloud, which the decoder converts into real point cloud data.

[...] In addition, Figs. 11-13 show the CD frequency distribution for each type of actuator before and after correction. Comparing the simulations without the correction system (Fig. 11(a), Fig. 12(a)) and the simulations with the correction system (Fig. 11(b), Fig. 12(b)) for the basic and side types, the CD of the point clouds decreased after correction; i.e., the overall value of the CD decreased. Therefore, based on our neural network, the correction system can learn high-dimensional features of the muscle structure in the actuator, and it is possible to simulate deformations close to those of the actual actuator. Comparing the simulation of the vertical type (Fig. 13(a)) and the simulation with the correction system (Fig. 13(b)), we find that CD values greater than 17 are reduced after correction. Although the vertical-type actuator has an untrained structure, the correction system can reduce the CD. Therefore, even if the actuator structure is untrained, its deformation can be simulated to some extent using the correction system.

[...] the simulated and real point clouds in the vertical type are mainly corrected, but new errors were generated.

We compared the trained and untrained data using the simulation results alone and the simulation results with the correction system. The correction system can correct nonlinear mechanical actions that are difficult to simulate using the PBD-based simulator. In this study, the learned structures performed well in the evaluation stage, and the unlearned structures also improved. However, there is still room for improvement.
For example, the neural network has not sufficiently learned the deformation of unlearned muscle structures, and the correction system's accuracy can still be improved. Moreover, this study employed deep learning to learn time-series point cloud features to compensate for the lack of accuracy of PBD modeling of flexible objects and to facilitate the control of flexible robots in the future.

In addition, only three types of artificial muscle distributions (basic, side, and cross) were used in the training set in this study. Tables 2 and 3 and Figs. 11 and 12 indicate that, for the same muscle type, more training data can improve the correction ability of the neural network. For unseen data, however, we tentatively propose that adding more muscle distribution types to the training set can improve the generalization ability of the neural network; that is, the neural network's ability to correct the simulated point cloud of unlearned muscle structures will improve.

The structure of the constructed correction system was developed using the PointNet autoencoder and LSTM. In future research, other neural network structures, such as the attention mechanism, can be used to solve this problem. The way a neural network learns is similar to how the human brain thinks, and there may be a more advanced framework for extracting point cloud features in the future. Recently, various [...]

VOLUME 10, 2022

[...] will also improve the simulation accuracy for untrained structures. Alternatively, simply augmenting the learning data for training the correction system may also increase simulation accuracy, but this may train the neural network deeper rather than wider. In addition, we look forward to adding more muscle structures to the training set, which will enable neural networks to fit higher-dimensional features and will have a positive effect on the correction of unknown muscle structures.

This work is intended for use in robot design or control. To drive the fabric to the expected deformation, many simulated parameters related to the muscles and the fabric in FEM must be optimized. The muscle parameters to be considered include the number of muscles and their length, direction, diameter, and mount point. Therefore, FEM-based design is challenging because it requires extensive deformation simulation using various parameters. The constructed correction system can be used with PBD to design a structure that responds with the intended deformations. In future research, the correction system can provide robot designers with rapid real-time simulation and verification of complex wearable devices without using real actuators, owing to its low computational cost. Moreover, we expect to evaluate the specific difference in computational cost between the proposed method and FEM.