A Hybrid Adaptive Controller for Soft Robot Interchangeability

Soft robots have been applied in areas such as surgery, rehabilitation, and bionics due to their softness, flexibility, and safety. However, it is challenging to produce two identical soft robots, even with the same mold and manufacturing process, owing to the complexity of soft materials. Meanwhile, the widespread use of a system requires interchangeability: the ability to replace inner components without significantly affecting system performance. Motivated by this property, a hybrid adaptive controller is introduced to achieve interchangeability from the perspective of control. The method uses an offline-trained recurrent neural network controller to cope with the nonlinear and delayed responses of soft robots, together with an online optimizing kinematics controller that reduces the error left by the neural network controller. Soft pneumatic robots with different deformation properties but the same mold are used in validation experiments. Systems with different actuation configurations and different robots follow the desired trajectory with errors of 3.3 ± 2.9% and 4.3 ± 4.1% of the working space length, respectively. The adaptive controller also performs well under different control frequencies and desired velocities, and it outperforms a model-based controller in simulation. This controller endows soft robots with the potential for wide application; future work may explore different offline and online controllers and a strategy for adjusting the weight parameter online.


I. INTRODUCTION
Soft robots offer infinite degrees of freedom (DOFs), adaptation to environments, and safety to humans. They have therefore been exploited in various settings, such as medical applications [1], wearable devices [2], and navigation [3]. Although soft robots have many advantages, their massive application is partly hindered by manufacturing errors. Many soft robots are made with molds and silicone gels, and this process yields robots with variances even under identical molds and manufacturing conditions. Moreover, expansion and shrinkage during usage induce unrecoverable deformation, and material aging over time may further change deformation properties and robot motion. Even for nominally symmetrical robots, it is challenging to build a truly symmetrical structure, and the softness in some directions may differ from others; this can be observed in the asymmetrical working spaces in [4], [5]. For these reasons, modeling and controlling a soft robot is challenging, and soft robots currently fail to serve as reliable components for massive applications. Interchangeability, the ability of a series of components to replace each other without changing the system's performance [6], is a significant property of modern manufacturing. For example, bearings of the same size can take the place of one another, and they can be mounted in any direction. Similarly, as shown in Fig. 1, interchangeability is required among soft robots, and, if the robots are symmetrical, among actuation configurations. Fig. 1-(A) illustrates that in a series of soft robots, one robot can replace another without affecting system performance, and Fig. 1-(B) illustrates that one robot can be connected to valves in any reasonable actuation configuration. Wide application requires interchangeability, but the manufacturing errors mentioned above deteriorate it. Hence, we aim to enable soft robots to achieve interchangeability with the help of a hybrid adaptive controller.
In this paper, we endow soft robots with interchangeability from the perspective of control. Recently, neural networks (NNs) have frequently been chosen as controllers for soft robots because they can handle nonlinear and sequential data, but such data-driven approaches are restricted by their training dataset. Therefore, an online-updating controller is required to deal with the mismatch between the training robot and the test robot.
This work combines an offline-trained long short-term memory (LSTM) controller with an online optimizing kinematics controller into a hybrid adaptive controller that facilitates interchangeability in soft robots. We demonstrate the controller on different actuation configurations and on a pair of soft robots with manufacturing errors. Experimental results show that although these robots deform differently, they achieve interchangeability thanks to the hybrid adaptive controller. The controller also fulfills trajectory following tasks under different velocities and control frequencies and outperforms a model-based controller on simulated soft arms.
The contributions of this paper are summarized as follows: 1) We introduce an offline-trained LSTM controller based on data collected from a soft robot at low time and computational cost. 2) We apply an online optimizing kinematics controller to compensate for errors caused by the difference between the training and testing robots. 3) Experiments prove that soft robots are interchangeable among actuation configurations and among robots from the same manufacturing process; the controller is also adaptive to various velocities and control frequencies and performs better than a model-based controller in simulation.
The rest of the paper is structured as follows: Section II reviews neural network controllers and online controllers for soft robots; Section III introduces our proposed LSTM controller, kinematics controller, and hybrid adaptive controller; Section IV describes the experimental setup, including the robot, experimental devices, and communication diagram; Section V presents the experimental results, which achieve soft robot interchangeability thanks to the proposed adaptive controller; Section VI summarizes the paper and discusses possible future work.

II. RELATED WORK

A. Neural network controllers for soft robots
NNs are attractive to soft robot researchers because they are composed of nonlinear activation functions, which can be effective in dealing with the nonlinear behaviors of soft robots. NNs were first applied to soft robot research in 2007 [7] and have since been leveraged for modeling [8] and control [9].
Recently, recurrent neural networks (RNNs) have proven viable for soft robot research. Their recurrent structures can capture time-related data relationships, making RNNs effective tools for soft robot control. One RNN, the nonlinear autoregressive network with exogenous inputs (NARX), is applied in [10] for open-loop control of a soft robot in trajectory following tasks. Among RNNs, LSTM performs well thanks to its long-term memory. The authors of [11] show that LSTM outperforms an analytical model in control tasks. LSTM is also employed in [12] on a soft pneumatic finger for contact position and force estimation. In [13], LSTM estimates finger motion and contact force, and the paper shows that estimation accuracy improves with more sensing channels. The authors of [14] demonstrate that LSTM and another RNN, gated recurrent units, can estimate symmetrical soft robot motion under random contact and partial sensor failure.
Although training datasets provide the information for neural networks, they also limit the areas in which the networks can be applied. In [5], an NN trained on a simulation dataset shows unsatisfying control performance on the real robot. To adapt between different actuation configurations and different robots, this work involves an online adjusting controller.

B. Online controllers for soft robots
Statistical controllers, such as the Gaussian mixture model [15] and Gaussian process regression [16], require far less data than NN controllers. They can therefore update online and are robust to model misalignment and unpredicted noise. The Kalman filter is also a popular choice for online-updating controllers: in [17], a Kalman filter estimates the Jacobian matrix, and an optimal controller is built on the estimate.
Jacobian methods have also been employed as online-updating controllers for soft robots. These approaches assume a locally linear relationship between changes in end position and changes in actuation. The Jacobian method was first introduced in [18], where the actuation variables are determined by an optimization problem with the Jacobian matrix as a constraint. In [19], the inverse Jacobian matrix is instead applied directly for control. Such a simple approach can also be leveraged to control contact force with a force-displacement model [20].
Due to the linear assumption and noisy responses, the Jacobian method requires a high control frequency, whereas the kinematics controller applied in this paper works well even at low frequencies. Compared with an online-learning NN, which requires a data collection process and cannot update every time step [5], the kinematics controller updates at the same frequency as the control strategy. Surveys of RNN and online controllers for soft robot control can be found in [21], [22].

III. METHOD
This section introduces the hybrid adaptive controller exploited in this paper. We first train an LSTM network offline and then propose an online optimizing kinematics controller for adaptation. The estimates from these two controllers are weighted and executed. The control strategy diagram is depicted in Fig. 2. The sensor in the robot system provides the end position for the controller, and the desired trajectory and recent actuation are also sent to the controller. The kinematics controller updates based on the most recently collected data, and both the kinematics controller and the LSTM estimate the actuation for the next step. A weight parameter blends the estimated actuation variables from the two controllers, and the weighted actuation is sent to the real robot system to achieve the trajectory following tasks.

A. LSTM Controller
In an RNN, information from previous steps is taken into consideration through the hidden states h, and LSTM selects which information to forget and remember through the cell states C. For the cell representing step t, the calculation is

f_t = sig(W_f [h_{t−1}, x_t] + b_f),
i_t = sig(W_i [h_{t−1}, x_t] + b_i),
C̃_t = tanh(W_C [h_{t−1}, x_t] + b_C),
C_t = f_t × C_{t−1} + i_t × C̃_t,        (1)
o_t = sig(W_o [h_{t−1}, x_t] + b_o),
h_t = o_t × tanh(C_t),

where f_t, i_t, C_t, o_t, and h_t denote the forget gate, input gate, cell state, output gate, and hidden state of step t, respectively; sig is the sigmoid function; x_t is the input of step t; W_* and b_* represent the weight and bias parameters of the corresponding calculation; and × is the Hadamard product operator in these equations.
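As a concrete illustration of the gate equations above, a single LSTM step can be written in a few lines of NumPy. The weight shapes, dictionary layout, and function name below are illustrative choices, not the trained controller's implementation.

```python
import numpy as np

def sig(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_cell(x_t, h_prev, C_prev, W, b):
    """One LSTM step following the gate equations above.
    W[k] has shape (n_hidden, n_hidden + n_in); b[k] has shape (n_hidden,)."""
    z = np.concatenate([h_prev, x_t])          # [h_{t-1}, x_t]
    f_t = sig(W["f"] @ z + b["f"])             # forget gate
    i_t = sig(W["i"] @ z + b["i"])             # input gate
    C_tilde = np.tanh(W["C"] @ z + b["C"])     # candidate cell state
    C_t = f_t * C_prev + i_t * C_tilde         # Hadamard products
    o_t = sig(W["o"] @ z + b["o"])             # output gate
    h_t = o_t * np.tanh(C_t)                   # hidden state
    return h_t, C_t
```

In practice the controller stacks several such cells (4 layers in Subsection V-A) and a library implementation such as PyTorch's LSTM module is used instead.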
To utilize an LSTM network as a controller, a network is trained offline with a collected dataset. The previous actuation values and positions are employed as inputs, and the current actuation values are employed as the output. The training mapping can be summarized as

a_{LSTM,t} = LSTM(p_{t+1}; p_t, p_{t−1}, ...; a_{t−1}, a_{t−2}, ...),        (2)

where p_t ∈ R^{2×1} and a_t ∈ R^{2×1} denote the end position and actuation value at step t, and a_{LSTM,t} ∈ R^{2×1} represents the actuation value estimated by the LSTM controller at step t.
In this paper, we employ the 2D position of the robot end to represent robot states, and there are two independent actuation variables. During the control process, the end position from the desired trajectory takes the place of p_{t+1}, and the LSTM estimates the corresponding actuation values according to the desired trajectory and the robot motion history. Hyperparameter choices such as the number of previous steps, the number of layers, and the hidden state size are introduced in Subsection V-A.
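The training mapping above amounts to a windowing routine over the collected dataset: each sample stacks the next position with a history of positions and actuations, and the target is the current actuation. The array layout and function name in this sketch are assumptions for illustration, not the authors' code.

```python
import numpy as np

def make_training_pairs(positions, actions, n_prev=10):
    """Assemble (input, target) pairs for the LSTM controller: the input
    stacks the next position p_{t+1} with the n_prev previous positions
    and actuations, and the target is the actuation a_t."""
    X, Y = [], []
    for t in range(n_prev, len(positions) - 1):
        hist_p = positions[t - n_prev + 1 : t + 1]   # p_{t-n_prev+1} .. p_t
        hist_a = actions[t - n_prev : t]             # a_{t-n_prev} .. a_{t-1}
        x = np.concatenate([positions[t + 1], hist_p.ravel(), hist_a.ravel()])
        X.append(x)
        Y.append(actions[t])                         # a_t is the target
    return np.array(X), np.array(Y)
```

During control, the desired trajectory point replaces p_{t+1} in the input while the history comes from the executed motion.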

B. Kinematics Controller
The motivation for an online-updating controller lies in the observation that neural network controllers are constrained by their training dataset and fail to adapt to new robots. The kinematics controller therefore compensates for the mismatch between the test robot and the robot used to train the LSTM. Due to its low data requirement, the kinematics controller can learn the robot kinematics online at every time step. The kinematics matrix K updates by solving the optimization problem

K_t = argmin_K || Δp_t − K Δa_{t−1} ||²,        (3)

where Δp_t = p_t − p_{t−1}, Δa_{t−1} = a_{t−1} − a_{t−2}, and the residual Δp_t − K Δa_{t−1} is the estimation loss of the kinematics matrix K. We apply the identity matrix I_{2×2} as the initial kinematics matrix at the beginning, and the optimized matrix of each time step is utilized as the initial guess for the next time step.
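A minimal sketch of this online update is given below, assuming simple gradient descent on the squared residual, warm-started from the previous matrix; the paper does not specify its optimizer or step sizes, so the learning rate and iteration count here are illustrative.

```python
import numpy as np

def update_K(K, p_t, p_prev, a_prev, a_prev2, lr=0.5, n_iter=20):
    """One online update of the 2x2 kinematics matrix K: minimize
    ||dp - K da||^2 by gradient descent, warm-started from the previous K."""
    dp = p_t - p_prev            # observed position change
    da = a_prev - a_prev2        # actuation change that caused it
    for _ in range(n_iter):
        resid = dp - K @ da                  # estimation loss of K
        # gradient of 0.5*||resid||^2 w.r.t. K is -resid da^T, so step along +resid da^T
        K = K + lr * np.outer(resid, da)
    return K
```

Because only one increment pair (Δp, Δa) is used per step, the update adjusts K locally rather than identifying it globally, which matches its role as a local corrector.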
To utilize the kinematics method as a controller, similar to [23], we employ another optimization problem to calculate the actuation variables:

a_{K,t} = argmin_a || p_{tar,t+1} − p_t − K_t (a − a_{t−1}) ||²,        (4)

where p_{tar,t+1} ∈ R^{2×1} is the target position at step t + 1 and a_{K,t} ∈ R^{2×1} denotes the actuation variable estimated by the kinematics controller. In the future, we may use the inverse matrix directly for control or include another online controller.
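For a well-conditioned 2×2 kinematics matrix, this optimization has a closed-form minimizer, which the sketch below uses; treating it this way is an assumption for illustration, since the paper solves the problem numerically.

```python
import numpy as np

def kinematics_action(K, p_t, p_tar, a_prev):
    """Compute the kinematics controller's actuation: for an invertible
    2x2 K the residual can be driven to zero by choosing a such that
    K (a - a_prev) = p_tar - p_t."""
    da = np.linalg.solve(K, p_tar - p_t)   # actuation increment
    return a_prev + da
```

The returned estimate is then blended with the LSTM controller's output through the weight parameter w of the hybrid controller.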

C. Hybrid Controller
The offline-trained LSTM controller and the online optimizing kinematics controller are combined to cope with the nonlinear and delayed responses of soft robots while adapting to the real test robots. A weighting parameter w composes the hybrid controller:

a_{hy,t} = w a_{K,t} + (1 − w) a_{LSTM,t},        (5)

where a_{hy,t} represents the actuation value estimated by the hybrid controller at step t. In this work, w is set to 0.1 after trial and error; in future work it may be adjusted online, inspired by the Kalman filter [24].

IV. EXPERIMENTAL SETUP

The soft robot is shown in Fig. 3-(A). For simplicity, two symmetrical chambers form a pair, and only one chamber in each pair is actuated in a time step according to the actuation values. In this way, two actuation values control a soft robot with two chamber pairs, and end positions on a 2D plane are used to build a robot motion dataset. For example, if the pressures of the left and front chambers are increased, the end moves to the right and back. Due to the silicone and origami structure material constraints, the maximal actuation pressure is about 0.5 bar.
The experimental devices and communication diagram are shown in Fig. 4-(A) and (B). NDI Polaris Spectra, an optical measurement system, is applied for end position sensing. Its error is 0.25mm RMS, lower than that of NDI Aurora (0.48mm RMS), the EM tracking system widely applied in soft robot research. One robot is fixed vertically on a base, and a reflective marker is fixed on the end of the robot. The robot is actuated by four valves controlled by a control board, and a personal computer communicates with the sensing system and the control board.
Fig. 5-(A) shows 20000 samples collected from robot α0, drawn as green dots, which can be seen as the working space of the robot. This dataset is collected at a frequency of about 3.3Hz, and the same frequency is applied to the control and sensing system in the following experiments. During data collection, the robot is driven by random actuation variables, and the actuation value differences between two adjacent steps are constrained to generate smooth motion. Collecting this dataset takes roughly 100 minutes. The original length of the extendable chambers is about 25mm, and the working space of a robot spans 60.94mm × 60.94mm. Evidently, although the robot is expected to be totally symmetrical, the working space is asymmetrical: the left and top parts are more difficult to reach than the others, which shows that there are manufacturing errors among actuation configurations. Fig. 5-(B) shows trajectories from different robots under the same actuation sequence. The average error between α0 and α* is 1.52mm, and the average error between α0 and β* is 5.21mm. Although they come from the same molds, they generate different trajectories, showing manufacturing errors among robots.
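The constrained random actuation used for data collection can be sketched as a bounded random walk over the two actuation values. The per-step bound max_delta is an assumed value, since the paper only states that adjacent steps are constrained; the 0.5 bar ceiling is the paper's pressure limit.

```python
import numpy as np

def random_actuation_sequence(n_steps=20000, a_max=0.5, max_delta=0.05, seed=0):
    """Generate a smooth random actuation sequence for data collection:
    each step changes each actuation value by at most max_delta bar,
    and pressures stay within [0, a_max] bar."""
    rng = np.random.default_rng(seed)
    a = np.zeros((n_steps, 2))
    for t in range(1, n_steps):
        step = rng.uniform(-max_delta, max_delta, size=2)   # bounded change
        a[t] = np.clip(a[t - 1] + step, 0.0, a_max)         # respect pressure limit
    return a
```

Bounding the step size keeps consecutive samples correlated, so the recorded motion is smooth enough for the LSTM to learn the delayed pressure-to-position response.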

V. EXPERIMENTAL RESULTS

A. LSTM Controller Performance
Based on the dataset, we train an LSTM network as the offline controller with PyTorch [25]. The dataset shown in Fig. 5-(A) is divided into a training set (70%), a validation set (10%), and a test set (20%). The validation set is used by the early stopping strategy during training to prevent overfitting, and the trained LSTM is evaluated on the test set. Following Subsection III-A, the actuation variables and positions from several previous steps are employed as input, and the output is the estimated actuation variable for the current time step.
To obtain a proper hyperparameter combination, we train networks with several combinations on the training set and evaluate them on the test set. The test errors are shown in Table I. The network with the lowest error is considered the best controller before the real experiments; accordingly, 10 previous time steps, 4 layers, a hidden state size of 128, and a dropout rate of 0.1 are adopted for the following experiments. The other combinations also yield low errors, so careful and time-consuming fine-tuning is not necessary for massive applications. It takes roughly 100 minutes to collect the 20000 data points and under 5 minutes to train an LSTM network; the collection and training processes are far shorter than those in [26] (about 10 hours), which applies time-consuming reinforcement learning. Two trajectories are used as following targets for the different controllers, as shown in Fig. 6-(A) and (B). The first is composed of a circle and a square; the second of a triangle and a curve. Together they contain acute angles, right angles, obtuse angles, straight lines, and curved lines. Each trajectory is divided into 400 positions, which serve as the target positions in the following tasks. The two trajectories are named A and B in this paper.
The robot α0, which generated the dataset, is controlled to follow these trajectories with the LSTM controller, and the performance is shown in Fig. 6-(A) and (B). The ratio of distance error to working space length (60.94mm) represents the trajectory following error in this paper. The following errors and standard deviations are shown in Table II, which illustrates that the LSTM controller achieves accurate trajectory following on the robot α0. Then the robot α0 with different actuation configurations, represented by α* in Fig. 3-(C), and the robots β* in Fig. 3-(B) are controlled to follow the trajectories. Fig. 7 shows that these hardware implementations obtain higher errors than the robot α0 under LSTM control; their errors and standard deviations are listed in Table II. Each subfigure in Fig. 7 covers nine experiments: three configurations or robots with three trials each. The LSTM obtains both high errors and high standard deviations on the actuation configurations α* and the robots β*, which illustrates that this data-driven approach is neither effective nor robust on robots beyond the training dataset.

B. Hybrid Controller Performance
To compensate for the differences between the test robots α*, β* and the robot α0, the online optimizing kinematics controller is involved according to Eq. 5, implemented with PyTorch [25]. Fig. 8 illustrates that the hybrid controller decreases the following errors and achieves interchangeability; the errors and standard deviations can be found in Table II. Considering the high LSTM trajectory following errors of β3 (over 10%), we increase the weight parameter w from 0.1 to 0.5, and the errors of the hybrid controller decrease to under 6%. This weight change is applied only to β3; 0.1 is retained for the other robots. To demonstrate the adaptability of the hybrid controller, we run experiments with various control frequencies and velocities on the robots α* and β* and trajectories A and B. To test adaptation to control frequency, we change the frequency from 3.3Hz to 2.5Hz and 4Hz, corresponding to time steps of about 0.4s and 0.25s. To test adaptation to velocity, the number of time steps changes from 400 to 300 and 500: in the experiments of Fig. 7, the desired trajectory is discretized into 400 positions, and here we discretize it into 300 and 500 positions, which indirectly changes the desired velocity. The errors and deviations are listed in Table III, and the following performance of α* on trajectory A is shown in Fig. 9. Overall, the adaptive controller fulfills these tasks with relatively low errors. Although the LSTM network is trained on 3.3Hz data, the adaptive hybrid controller performs well even at different frequencies.

C. Ablation experiments
An ablation study explores the function of a component by running the system without it and is widely used in machine learning [27]. In this paper, ablation experiments demonstrate the necessity of each part of the hybrid controller. The performance of the LSTM controller alone has been shown in Fig. 7: this component performs well on the original robot but may fail on new robots. To some extent, it can be seen as a global controller because it does not rely heavily on feedback and can be used everywhere in the working space.
For the kinematics controller, Fig. 10 shows its performance on the robot α1 and trajectory B. The hybrid controller is used at first, and the kinematics controller replaces it at the 100th, 200th, 300th, and 350th time steps, which is achieved by changing the weight parameter w from 0.1 to 1 in Eq. 5.
The following errors and deviations are shown in Table IV. The kinematics controller shows high following errors in the first three tasks. After switching to the kinematics controller, the robot still tries to follow the trajectory for several time steps but starts to oscillate after a while, which illustrates that it is only a local controller and may fail on a global task with turning. However, the fourth task, shown in Fig. 10-(D), is achieved accurately, and the error (3.9%) is even lower than that of the hybrid controller on α1 (4.1%). This error reduction shows that the kinematics controller alone, i.e., a weighting parameter of 1, is a better choice for some simple straight-line following tasks. Meanwhile, the weighting parameter w was changed from 0.1 to 0.5 for complex tasks such as trajectory following on the robot β3. An online weight-adjusting strategy may therefore improve the overall controller performance.

Besides real pneumatic four-chamber robots, we also apply our hybrid adaptive controller to simulated 1-DOF soft arms and compare it with a model-based controller. The soft arm simulation is built with PyElastica [28], as shown in Fig. 11-(A). The working space in these experiments is 0-2.4 rad, and data from 1000 time steps are collected for LSTM training. Based on the constant curvature (CC) model [29] and the pseudo-rigid-body model [30], we build the soft robot dynamics model

B q̈ + C q̇ + K q = τ,        (6)

where q is the bending angle; B, C, and K are the robot's inertia, damping, and stiffness factors; and τ is the control input.
According to [29], the controller is

τ = B q̈_d + C q̇_d + K q_d + I ∫ (q_d − q) dt,        (7)

where q_d is the desired angle trajectory and I is the gain of the integral action.
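To make this baseline concrete, the sketch below simulates the 1-DOF dynamics model above under the integral-augmented computed-torque controller. All numerical values (B, C, K, the integral gain, and the reference trajectory) are illustrative choices, not the paper's identified parameters.

```python
import numpy as np

def simulate_cc_controller(B=1.0, C=2.0, K=4.0, I_gain=2.0, dt=0.01, T=6.0):
    """Simulate B*q'' + C*q' + K*q = tau under the computed-torque controller
    tau = B*qdd_d + C*qd_d + K*q_d + I*integral(q_d - q)."""
    n = int(T / dt)
    t = np.arange(n) * dt
    q_des = 1.2 * (1 - np.cos(0.5 * np.pi * t / T))   # smooth desired bending angle
    qd_des = np.gradient(q_des, dt)
    qdd_des = np.gradient(qd_des, dt)
    q, qd, integ = 0.0, 0.0, 0.0
    q_hist = np.empty(n)
    for k in range(n):
        integ += (q_des[k] - q) * dt                   # integral of tracking error
        tau = B * qdd_des[k] + C * qd_des[k] + K * q_des[k] + I_gain * integ
        qdd = (tau - C * qd - K * q) / B               # plant dynamics
        qd += qdd * dt                                 # semi-implicit Euler step
        q += qd * dt
        q_hist[k] = q
    return t, q_des, q_hist
```

With a perfectly known model the feedforward terms cancel the plant exactly and the integral term absorbs residual error; mismatched B, C, K (e.g. a different Young's modulus) degrade tracking, which is why the adaptive hybrid controller can outperform this baseline across the 25 simulated arms.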
To test the adaptability of the controllers, the parameters of both controllers are tuned on a soft arm with Young's modulus 10 kPa and Poisson ratio 0.5; these deformation parameters have been applied in [31]. We then test them on a total of 25 robots with Young's moduli of 5, 7.5, 10, 12.5, and 15 kPa and Poisson ratios of 0.25, 0.375, 0.5, 0.625, and 0.75, as shown in Fig. 11-(B) and (C). The test errors and standard deviations of the hybrid adaptive controller and the CC controller are 2.8 ± 3.4% and 3.0 ± 3.9%, respectively. Our controller therefore outperforms a model-based controller and shows adaptive control ability on various robots.

VI. CONCLUSION AND DISCUSSION
This work achieves soft robot interchangeability with the help of an adaptive controller. An offline-trained LSTM controller and an online optimizing kinematics controller compose this hybrid adaptive controller, which obtains satisfying control performance on different actuation configurations and robots. These results demonstrate that such a controller is an effective and time-saving solution for soft robot control across different robots, trajectories, and frequencies, and it paves the way for the massive application of soft robots in industry.
In future work, other sequence-related neural networks, such as gated recurrent units, may take the place of the LSTM. Likewise, other online learning approaches, such as a transfer function controller, the Jacobian method, or Gaussian mixture regression, may replace the kinematics controller. We may also vary the number of time steps used by the kinematics controller to explore the influence of this parameter. Another fundamental issue is the choice of the weight parameter: an online updating strategy may be proposed for adjusting the kinematics weight, inspired by the Kalman gain update in the Kalman filter. We further plan to pursue the interpretability of neural networks from the perspective of mechanics and robotics, as in [32]. Owing to its ability to adjust to various robots, this control strategy may also be used on robots with different physical properties and on complex robot systems such as the reconfigurable modular robot system [33].

Fig. 1.
Fig. 1. Soft robot interchangeability. (A) One soft robot can replace another with the same mold and manufacturing conditions. (B) One symmetrical soft robot can be rotated and connected to valves with different actuation configurations. The LSTM controller is utilized to achieve trajectory following tasks and works well on (C) robot α0 but gets high tracking errors on (D) robot α*. Meanwhile, the hybrid adaptive controller can achieve the trajectory following task on (E) robot α*. Light green dots represent the working space. The red and blue lines show the desired and real trajectories, respectively. The light blue area illustrates the error band in nine trials.

Fig. 2.
Fig. 2. Control strategy diagram. The sensor in the robot system provides the end position for the controller. The desired trajectory and actuation in the last steps are also sent to the controller. The kinematics controller updates based on the most recently collected data, and both the kinematics controller and LSTM estimate actuation for the next step. A weight parameter is used to weigh the estimated actuation variables from those two controllers, and the weighted actuation is sent to the real robot system to achieve the trajectory following tasks.

Fig. 3.
Fig. 3. (A) A robot is composed of an origami structure and a silicone structure containing four chambers. (B) Four different soft robots from the same mold: α0, β1, β2, and β3. (C) Four different actuation configurations of the same robot: α0, α1, α2, and α3. (D) Section view of a soft robot. There are four chambers inside a robot, and an external constraint limits the radial deformation.

Fig. 4.
Fig. 4. (A) Experimental devices and (B) hardware communication diagram. A marker is fixed on the end of the soft robot. The robot is actuated by four valves which are controlled by a control board. A personal computer is utilized for communicating with the sensing system and control board.

Fig. 5.
Fig. 5. (A) Working space. The length of the working space side is 60.94mm. (B) Trajectories from α0, α*, and β*. They are actuated with the same actuation sequence but produce different trajectories.

Fig. 6.
Fig. 6. Trajectory following results on robot α0. LSTM tracking performance on (A) trajectory A and (B) trajectory B. Light green dots represent the working space. The red and blue lines show the desired and real trajectories, respectively. The light blue area illustrates the error band in three trials.

Fig. 7. LSTM controller performance on (A) robot α*, trajectory A; (B) robot α*, trajectory B; (C) robot β*, trajectory A; (D) robot β*, trajectory B. Light green dots represent the working space. The red and blue lines show the desired and the average of the real trajectories, respectively. The light blue area illustrates the error band in nine trials.

Fig. 9.
Fig. 9. Hybrid controller performance on robot α* with (A) 4Hz, (B) 2.5Hz, (C) 300 time steps, and (D) 500 time steps on trajectory A. Light green dots represent the working space. The red and blue lines show the desired and real trajectories, respectively. The light blue area illustrates the error band in three trials.

Fig. 10.
Fig. 10. Kinematics controller performance on the robot α*. The trajectories and end positions on the x-axis and y-axis are shown. The hybrid controller is employed at first, and the kinematics controller starts at the (A) 50th, (B) 100th, (C) 200th, and (D) 300th time step, respectively. Light green dots represent the working space. The yellow line shows the start time step of the kinematics controller. The red and blue lines show the desired and real trajectories, respectively. The light blue area illustrates the error band in three trials.

Fig. 11.
Fig. 11. (A) Diagram of soft arms from 0 to 2.4 rad. Tracking performance with (B) the hybrid adaptive controller and (C) the CC controller. The red and blue lines show the desired and the average of the real trajectories, respectively. The light blue area illustrates the error band in 25 trials.

TABLE IV
KINEMATICS CONTROLLER FOLLOWING ERRORS WITH α1 AND TRAJECTORY B