Reset Strategy for Output Feedback Multiple Models MRAC Applied to DEAP

Smart actuators have developed rapidly in recent years. Dielectric Electroactive Polymer (DEAP) actuators are an important class of smart actuators owing to features such as softness, high force ratio, fast operation and silence. In recent years, a set of dynamic models for DEAP actuators has been developed by various authors. Relying on these models, it is possible to design a wide range of feedback controllers. In our work, we develop an indirect adaptive controller for a DEAP actuator that exploits the multiple models approach with second layer adaptation. The results presented in this paper show that, in the case of piecewise continuous parameters, the benefits of second level adaptation can be lost. To solve this problem, a new resetting algorithm is proposed. The efficiency of the proposed control method is verified by simulations on a simple motivation example and on a DEAP actuator model.


I. INTRODUCTION
Smart materials are a promising class of materials that actuate in response to different stimuli [1]. They are used to build smart actuators such as Dielectric Electroactive Polymer (DEAP) actuators. Their main features, namely a high force-volume ratio, a soft membrane and a fast response time, have made them useful for building many prototypes, such as a pump [2], artificial muscles [3] and many others [4], [5]. A well-designed control system is a very important aspect of the design of these devices. To this end, nonlinear models of DEAP actuators were created [4], [6]-[9]. Relying on linear and nonlinear models, a wide range of control systems for DEAP actuators has been created. In works [7], [8], the PID controller was studied. Feedback control was designed to obtain high precision in work [10], and compensation of hysteresis was proposed in work [11]. Sliding mode control was used to build the controller in work [12]. A simple alternative was proposed in [13], where open-loop control is designed. Additionally, machine learning approaches have also been studied. For instance, reinforcement learning was used to find a neural network controller in work [14]. Intelligent control based on a fuzzy system was designed in work [15].
The associate editor coordinating the review of this manuscript and approving it for publication was Engang Tian.
In our approach, we study the design of an adaptive controller. This technique is well known for linear systems and for nonlinear systems with linear parametrization [16]-[19]. In recent years, the multiple model technique was reported to significantly improve the transients of adaptive systems [18], [20], [21]. In the past, the multiple model technique based on a switching strategy was applied to control systems [18]. This technique is said to increase the efficiency of a controlled system; however, it requires an excessive number of models [20]. This problem was solved in work [22] by applying an adaptive controller with second level adaptation. This solution was further studied for nonlinear systems with linear parametrization [23], fractional systems [24], observer design [25], [26] and artificial intelligence [27], [28]. An extension of this technique, with second level adaptation based on error integration, was proposed in work [29]. A new approach presented in [30] considers a three-layered adaptive control. The application of adaptive control to linear varying systems and periodic systems is shown in [31] and [27], respectively.
In our work, we design an adaptive controller for a DEAP actuator using indirect adaptive control based on multiple models. It is worth mentioning that a direct adaptive controller was previously constructed by us in work [32]. Here we exploit the recent technique called second level adaptation. Our aim is to design a controller that operates under different working conditions; therefore, adaptive control is applied to design the controller. We will show that, when an adaptive controller with second level adaptation operates for a long time, all of the multiple models converge. In such conditions, the benefits of multiple models are very limited or lost. Therefore, a new algorithm based on resetting the multiple models to their initial structure is proposed. The main contribution of this work is thus the introduction of MRAC with multiple models and a resetting algorithm. First, the new algorithm is briefly described on a motivation example. Then it is applied to a DEAP actuator. This work is organized as follows. Section II describes the DEAP actuator model and its linearization. In Section III, the multiple model adaptation with the resetting algorithm is introduced. Section IV shows simulations of the presented scheme for the DEAP actuator and highlights the performance improvement caused by the resetting algorithm.
VOLUME 8, 2020 This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/

II. DEAP ACTUATOR MODEL
In our work, we consider a Dielectric Electroactive Polymer (DEAP) actuator biased with a mass [9], [32]. Following our previous works [9], [32], the nonlinear model is given by (1), where $x_1$ is the distance, $x_2$ is the strain of the damper, $x_3$ is the velocity and $s(x_1) = 1 + \ldots$ The stress $\sigma_e$ is defined in terms of the stretch $\lambda_r$, with fixed $\alpha_i = 2, 4, 6$. The actuator input $u$ is the voltage and the output $y$ is the distance. The DEAP actuator has an input nonlinearity $u^2$, which is simple to compensate by introducing a new input $v = u^2$. This approach was used, for instance, in the PID controller presented in work [33] for a DEAP actuator. The presented model considers the mechanical stress of the DEAP actuator and the electromechanical coupling. The mechanical stress is described by viscoelastic and hyperelastic effects. The former is covered by two dampers represented by the parameters $k_e$, $\eta_e$ and $b$. The hyperelastic stress is represented by $\sigma_e(\lambda_r)$. The Maxwell stress couples the input voltage $u$ with the mechanical part. The problem of DEAP actuator modeling is thoroughly discussed in works [1], [4], [6], [8], [10], [33]. The parameters are described in Section IV along with their values.
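The compensation of the input nonlinearity can be illustrated with a short Python sketch. It assumes that a linear controller produces the auxiliary input $v$ and that the actuator accepts only a non-negative voltage $u$; the function name and the clipping of negative requests to zero are assumptions made for illustration, not part of the original design.

```python
import math


def voltage_from_auxiliary_input(v):
    """Map the auxiliary input v = u^2 back to a non-negative voltage u."""
    # Assumption: a negative request cannot be realized, so it is clipped to 0.
    return math.sqrt(max(v, 0.0))


# Nominal point quoted in Section IV: v_n = 12.25 kV^2 corresponds to u_n = 3.5 kV.
u_n = voltage_from_auxiliary_input(12.25)
```

With this mapping, the controller can be designed for the plant seen from $v$, while the actuator still receives a voltage.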
In our work, we consider an adaptive controller for the DEAP actuator. Not all state variables of the presented model are available for measurement. For instance, the strain of the damper $x_2$ is a virtual rather than a physical signal. Further, the velocity is also considered unavailable in our work. Therefore, we design an adaptive controller based on output feedback, which makes the problem more challenging. In our approach, we use the Indirect Model Reference Adaptive Controller, which is well known for linear systems. The nonlinear model (1) is linearized around the working point [32]. The nominal state and the nominal voltage are found by solving (3). This allows us to design a controller for the plant $G_{DEAP}(s) = k_p Z_p(s)/R_p(s)$, whose coefficients are obtained by linearization of the nonlinear plant [32]. During the simulations, it was found that the coefficients of $G_{DEAP}(s)$ vary for different working points. Therefore, adaptive control allows the controller to adjust dynamically to new parameter values.

III. OUTPUT FEEDBACK MULTIPLE MODELS MRAC WITH RESETTING
The Indirect Model Reference Adaptive Controller is a well-known technique that allows for adaptive output feedback control. A crucial improvement of this technique was presented in the recent work [20]. The application of multiple models with second level adaptation allows for fast error convergence to 0. Furthermore, it requires fewer models than the switching techniques presented in [18]. In this section, we introduce the resetting algorithm for the second level adaptive law in the case of piecewise constant parameters. We also give a short motivation example showing why such an algorithm is required.

A. MOTIVATION EXAMPLE
Let us consider a first-order system in which $a$ is an unknown but piecewise constant parameter, $x$ is the state and $u$ is the input. As in work [20], we construct identifiers (multiple models) with a gradient adaptive law, where $a_m > 0$ defines the identifier dynamics, $\hat{x}_i$ is the estimated state, $\hat{a}_i$ is the estimated parameter, $\gamma$ is the adaptation gain, $e_i = \hat{x}_i - x$ is the identification error and $i = 1, 2$, because a first-order system with a single unknown parameter requires only two virtual models. The estimated parameter is expressed as $\hat{a} = \hat{\alpha}_1 \hat{a}_1 + \hat{\alpha}_2 \hat{a}_2$, where $\hat{\alpha}_1$, $\hat{\alpha}_2$ are the second layer weights. If a closed-loop control system were considered, the parameter $\hat{a}$ would be used to calculate the controller parameters. The second layer coefficient $\hat{\alpha}_1$ is estimated by an adaptive law with second level adaptation gain $\gamma_{mm} > 0$, driven by the second level error. In work [20], the stability of such systems was proved for constant parameters. Further, it was shown that the second layer improves the transients of adaptive systems. We want to analyse the multiple models algorithm in the case of piecewise constant parameters, as this case was not considered in previous works on multiple model adaptive control with a second layer.
In the simulation, […] is set to 2 and the gains are $\gamma = 20$ and $\gamma_{mm} = 60$. The results are presented in Figure 1. The estimated parameters $\hat{a}$, $\hat{a}_1$, $\hat{a}_2$ converge to the plant parameter $a$. The behaviour of the system at the first parameter switch differs from that at the next switch. Once the parameters $\hat{a}_1$, $\hat{a}_2$ converge to the same value, the identification errors $e_1$ and $e_2$ also converge to the same value, which is clearly visible in Figure 1. As a result, the second layer is inactive at the second switch: the derivative of $\hat{\alpha}_1$ is almost equal to 0 after the first 10[s]. In such a situation, the benefits of the multiple model algorithm are not obtained.
Hence, we propose an algorithm for resetting the multiple model structure that handles the presented situation. We would like to reset the values of the identifier parameters $\hat{a}_1$, $\hat{a}_2$ based on the signals available in the control system. Finally, we state the main assumptions of the resetting algorithm:
• it works based on the signals available in the control system,
• it notifies about the switch of parameters,
• it re-initiates the structure of identifiers.
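The loss of second layer activity in the motivation example can be reproduced qualitatively with a short Python sketch. The plant form $\dot{x} = ax + u$, the identifier form, the second layer gradient law, the input signal, the initial models and the switch time are all assumptions made for illustration. Only the gains $\gamma = 20$ and $\gamma_{mm} = 60$ come from the text, and using the value 2 for $a_m$ is itself an assumption.

```python
import math


def simulate_two_models(t_switch=30.0, t_end=60.0, dt=1e-3,
                        a_before=-1.0, a_after=-2.5,
                        a_m=2.0, gamma=20.0, gamma_mm=60.0):
    """Euler simulation of two identifiers with one second layer weight."""
    x = 0.0                    # plant state
    x_hat = [0.0, 0.0]         # identifier states
    a_hat = [-4.0, 0.0]        # assumed initial models enclosing a_before
    alpha1 = 0.5               # second layer weight, alpha2 = 1 - alpha1
    k_switch = int(round(t_switch / dt))
    at_switch = None
    for k in range(int(round(t_end / dt))):
        if k == k_switch:
            # Record the first layer just before the plant parameter changes.
            at_switch = {"a_hat": tuple(a_hat),
                         "spread": abs(a_hat[0] - a_hat[1])}
        a = a_before if k < k_switch else a_after
        u = math.sin(k * dt)   # assumed exciting input
        e = [x_hat[0] - x, x_hat[1] - x]
        # Assumed plant, identifier and gradient-law forms.
        dx = a * x + u
        dx_hat = [-a_m * e[i] + a_hat[i] * x + u for i in range(2)]
        da_hat = [-gamma * e[i] * x for i in range(2)]
        # Assumed second layer law: gradient descent on eps = alpha1*(e1 - e2) + e2.
        d = e[0] - e[1]
        dalpha1 = -gamma_mm * d * (alpha1 * d + e[1])
        x += dt * dx
        for i in range(2):
            x_hat[i] += dt * dx_hat[i]
            a_hat[i] += dt * da_hat[i]
        alpha1 += dt * dalpha1
    a_combined = alpha1 * a_hat[0] + (1.0 - alpha1) * a_hat[1]
    return at_switch, a_combined


at_switch, a_final = simulate_two_models()
```

Under these assumptions, the spread between $\hat{a}_1$ and $\hat{a}_2$ recorded at the switch is close to zero. The difference $e_1 - e_2$ that drives the second layer is then also close to zero, which is the situation the resetting algorithm is meant to detect.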

B. RESETTING ALGORITHM
The Indirect Model Reference Adaptive Controller with a normalized adaptive law assumes only the availability of the input and output signals. Therefore, the identifier uses an overparametrized model of the plant to express the system in an input-output representation. The configuration is given by the model (10), $z = \theta^T \phi$, where $\theta = [\theta_1 \ldots \theta_j \ldots \theta_{2n}]^T$ is the plant parameter vector containing the transfer function coefficients, $z$ is the filtered output and $\phi$ is the vector of filtered input and output (see [16], chapter 2.3 and 6.6). In the case of the multiple model extension, the adaptive system consists of multiple identifiers and a single controller. The number of identifiers depends on the order of the system $n$ and is equal to $n + 1$. Hence, the same number of estimated plants is required: $\hat{z}_i = \hat{\theta}_i^T \phi$, where $\hat{z}_i$ is the estimated filtered output, $\hat{\theta}_i$ is the vector of estimated parameters and $i = 1, \ldots, n + 1$. The vector $\hat{\theta}_i$ describes a single model of the plant and is calculated by one of the available adaptive laws based on the gradient, the instantaneous cost function or the integral cost function [16]. In the case of normalized adaptive laws, a normalized estimation error is used for each identifier. The set of all identifiers forms the first layer of the identifier. The goal of the second layer is to calculate an estimate of the plant based on the identifiers in the first layer. Based on work [20], the estimated plant parameter $\hat{\theta}$ is defined as $\hat{\theta} = \sum_{i=1}^{n+1} \hat{\alpha}_i \hat{\theta}_i$, where $\hat{\alpha}_i$ is the estimated weight of a single model. In the second layer, the weights $\hat{\alpha}_i$, $i = 1, \ldots, n$, are also estimated by an adaptive law, which is driven by the error of the second layer. The last weight is obtained from $\hat{\alpha}_{n+1} = 1 - \sum_{i=1}^{n} \hat{\alpha}_i$. In work [20], it was shown that adaptation in two layers can provide better transient performance of an adaptive system with constant parameters. However, in the case of piecewise constant parameters, as shown in the motivation example, the crucial point is to preserve the convex structure of the estimated plant parameters $\hat{\theta}_i$ during the adaptation process.
This requires a resetting algorithm that reconstructs the convex hull when the parameter values switch. Let us introduce a measure of the deviation of the parameters. For each $j = 1, \ldots, 2n$, we define a vector $p_j$ that collects the $j$-th estimated parameter of all the multiple models. The standard deviation $s_j$ of $p_j$ is given by (16), where $\bar{p}_j$ is the mean of the elements of $p_j$. The standard deviation $s_j$ describes how much the $j$-th parameter differs between the multiple models. If the $j$-th parameter is the same for all multiple models, then $s_j$ is equal to 0.
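The deviation measure can be sketched in Python as follows. The population form of the standard deviation (division by the number of models) is an assumption, since the normalization used in (16) is not recoverable here; the function name and the list-of-lists layout are also illustrative.

```python
from statistics import pstdev


def parameter_spread(models):
    """Return s_j for every parameter index j across all models.

    `models` is a list of parameter vectors, one per identifier.
    Assumption: population standard deviation (statistics.pstdev).
    """
    n_params = len(models[0])
    return [pstdev([m[j] for m in models]) for j in range(n_params)]


# Two models that agree on the first parameter and differ on the second.
spread = parameter_spread([[1.0, 2.0], [1.0, 4.0]])
```

In this example the first entry of `spread` is 0 because both models share the same first parameter, which is the case the measure is designed to expose.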
We propose the reset condition (17), where $s_t$ is the threshold for the standard deviation and $\varepsilon_t$ is the threshold for the identification error. If the reset signal $r$ is 1, the estimated parameters are set to their initial structure. For instance, if $r$ is equal to 1 at time instant $t_r$, then $\hat{\theta}_i(t_r) = \hat{\theta}_i(0)$. The expression $\min_{j=1,\ldots,2n} s_j$ gives the minimal standard deviation of the parameters across the multiple models. Hence, resetting occurs when some parameter is nearly the same in all identifiers and the identification error is high. The second condition is introduced to force resetting only if the identification error is high. It is worth mentioning that, due to the properties of the standard deviation signal $s_j$, this resetting condition does not require hysteresis: after the reset, $s_j$ returns to its initial value, which should be designed to be higher than $s_t$. In the multiple models algorithm, it is possible to choose the structure of the parameters. Examples of the structure for systems of order 1 and 2 are presented in Fig. 2. At the moment of reset, the estimated parameters are set to the initial structure again. Hence, the value of $\hat{\theta}$ before the reset differs from its value after the reset, which causes a jump in the control $u$. From the performed simulations, we conclude that such behavior has an adverse effect on the control system. However, it is possible to preserve the continuity of $\hat{\theta}$. Let us denote by $\hat{\theta}^-$, $\hat{\theta}^+$ the values of the estimated parameter before and after the reset. The expression $\hat{\theta}^+ = \sum_{i=1}^{n+1} \hat{\alpha}_i^+ \theta_i^0$ describes the estimated parameter after the reset. At the moment of reset, the coefficients $\hat{\alpha}_i^+$ are free to choose. Hence, we calculate them to satisfy $\hat{\theta}^- = \hat{\theta}^+$ by solving a linear system with matrix $M$. The matrix $M$ is invertible provided that the initial parameters $\theta_i^0$, which are chosen by the designer, form a convex hull.
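A minimal Python sketch of the reset logic is given below. It assumes that the condition compares the smallest spread with $s_t$ and the magnitude of a scalar identification error with its threshold using strict inequalities; the exact form of (17) may differ. The continuity step is shown only for the scalar two-model case of the motivation example, and all function names are hypothetical.

```python
def reset_signal(spreads, ident_error, s_t, eps_t):
    """Return 1 when the smallest spread is below s_t and the error exceeds eps_t."""
    return 1 if min(spreads) < s_t and abs(ident_error) > eps_t else 0


def weight_after_reset(theta_before, theta0_1, theta0_2):
    """Scalar two-model case: choose alpha1 so the combined estimate does not jump.

    Solves alpha1 * theta0_1 + (1 - alpha1) * theta0_2 = theta_before.
    """
    if theta0_1 == theta0_2:
        raise ValueError("initial models must differ")
    return (theta_before - theta0_2) / (theta0_1 - theta0_2)


# Models have collapsed (small spread) while the identification error is large.
r = reset_signal([0.01, 0.8], ident_error=0.2, s_t=0.05, eps_t=0.05)

# Re-initialize the models to -4 and 0 while keeping the combined estimate at -1.
alpha1_plus = weight_after_reset(-1.0, -4.0, 0.0)
theta_after = alpha1_plus * -4.0 + (1.0 - alpha1_plus) * 0.0
```

If the estimate before the reset lies outside the interval spanned by the initial models, the computed weight falls outside $[0, 1]$; how that case should be handled is not addressed by this sketch.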
We now briefly analyze the algorithm for the motivation example. We set the thresholds to $s_t = 0.05$ for the standard deviation and $\varepsilon_t = 0.05$ for the identification error. The results are shown in Fig. 3. The initial value $s_1(0)$ is equal to 1, and then $s_1$ converges to 0. The identification error also converges to 0. The parameter switch (at the time of 12[s]) causes the identification error to increase and forces the reset signal to 1. This also causes $s_1$ to return to 1. It is important to notice that the reset time is not equal to the parameter switch time. The identification error requires some time to increase; hence, in our example, the reset occurs 0.38[s] and 0.69[s] after the parameter switch.
Another possibility is to use the max function in the condition (17). In such a case, the parameter with the highest deviation would be chosen. This means that the structure will not be reset until all parameters converge to the same value. However, this is not practical if the convergence rates of the parameters differ significantly.

IV. SIMULATIONS
In this section, we present simulations of the DEAP actuator under adaptive control. The nonlinear model of the device and its parameters are taken from our previous work [9]. The values of the parameters are summarized in Table 1. The linear model $G_{DEAP}(s)$ is obtained by linearization of the nonlinear plant at the working point (3), as in our work [32]. $G_{DEAP}(s)$ is found for the nominal voltage $u_n = 3.5$[kV], which corresponds to $v_n = 12.25$[kV$^2$]. The degrees of $Z_p(s)$ and $R_p(s)$ are 1 and 3, respectively. The high-frequency gain, zeros and poles of the transfer function are presented in Table 2.
The parameters of $G_{DEAP}(s)$ depend on the nominal working point. To find how much the parameters vary for different working points, we calculate the coefficients of the polynomials $k_p Z_p(s) = b_1 s + b_0$ and $R_p(s) = s^3 + a_2 s^2 + a_1 s + a_0$. The values of the coefficients were found by linearization of the nonlinear plant over the range 1.5-7[kV]. Their relative values are presented in Fig. 4 and 5. The percentage change of the parameters of the DEAP actuator transfer function is presented in Table 3.

A. ADAPTIVE CONTROL
In this section, we present the adaptive control designed for the DEAP actuator. We assume that the device works at some nominal point for a period of time and that the working point then changes due to a reference command or an external signal (like load torque). As stated in the previous part, the coefficients of the linear model vary; hence, adaptation is required to find the new parameter values. For the MRAC, the reference model is chosen with roots $s_{1m} = -5 + j17.5$ and $s_{2m} = -5 - j17.5$ and a static gain equal to 1. The overparametrized plant model (10) is constructed by applying a filter with roots $\lambda = -10$. The multiple models structure is applied in the identifier. Therefore, the number of identifiers is equal to 4 (the system has order 3). In the first layer, the adaptation is performed by the least squares algorithm with gain $P_0 = 100$ or $P_0 = 500$. In the second layer, the adaptation gain $\gamma_{mm}$ has the value 5. The thresholds of the reset condition (17) are set as follows: $s_t = 2.0$ for the standard deviation and $\varepsilon_t = 0.05$ for the identification error. The Model Reference Controller is built from the parameters estimated by the identifier. To analyse the adaptive control systems, three variants of the controller are studied. The first is denoted as MRAC (Model Reference Adaptive Control, without multiple models and without resetting), the second as MM-MRAC (MRAC with multiple models), and the third as MM-RESET-MRAC (MRAC with multiple models and the resetting algorithm). The performance indexes were calculated for the control system following the trajectory (20), $y_r(t) = 1.5 + 0.75\,\mathrm{sign}(\sin(2\pi \ldots$, with $T_1 = 288$[s] and $T_2 = 6$[s]. This means that a parameter switch due to a change of the nominal point occurs every 144[s]. The simulations are done for two levels of gain: $P_0 = 100$ and $P_0 = 500$. To save space, the plots are shown for $P_0 = 100$. The identification and control errors are shown in Fig. 6, and a zoom of the output is shown in Fig. 7. In the performance indexes, $T_f$ is the final time of the simulations.
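The reference model chosen at the start of this subsection can be written out explicitly. The sketch below assumes a second-order reference model without zeros, which is consistent with the relative degree of the plant (degrees 1 and 3) but is not stated in the text; the function name is hypothetical.

```python
def reference_model_denominator(s1, s2):
    """Coefficients (1, c1, c0) of (s - s1)(s - s2) for a conjugate root pair."""
    c1 = -(s1 + s2)
    c0 = s1 * s2
    return 1.0, c1.real, c0.real


den = reference_model_denominator(complex(-5.0, 17.5), complex(-5.0, -17.5))

# Unit static gain: W_m(0) = numerator / c0 = 1, so the numerator equals c0.
static_gain_numerator = den[2]
```

Under this assumption, the reference model is $W_m(s) = 331.25 / (s^2 + 10 s + 331.25)$.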
The function $E_{mean}(t)$ calculates the mean absolute value of the control error over the time $T_2$. The results for the three algorithms are presented in Fig. 8. The adaptation process is shown in Fig. 9 for the single parameter $k_p$. The reset signal is shown in Fig. 10. The structure is reset, with a small delay, at the switch at time 288[s]. The performance indexes are shown in Tables 4 and 5 for the gains $P_0 = 100$ and $P_0 = 500$, respectively. For all indexes, the control method MM-RESET-MRAC improves the transients.
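A windowed mean absolute error of the kind described for $E_{mean}(t)$ can be sketched in Python. The exact definition used in the paper is not recoverable here, so the trailing window, the uniform sampling and the handling of the first samples are assumptions, and the function name is hypothetical.

```python
def windowed_mean_abs(errors, dt, window):
    """Trailing mean of |e| over the last `window` seconds of sampled data.

    Assumption: uniform sampling with step dt; before a full window is
    available, the mean is taken over the samples seen so far.
    """
    n = max(1, int(round(window / dt)))
    out = []
    acc = 0.0
    for k, e in enumerate(errors):
        acc += abs(e)
        if k >= n:
            acc -= abs(errors[k - n])  # drop the sample that left the window
        out.append(acc / min(k + 1, n))
    return out


# Four samples, window of two samples.
e_mean = windowed_mean_abs([1.0, -1.0, 3.0, -3.0], dt=1.0, window=2.0)
```

For the simulations described above, `window` would correspond to $T_2 = 6$[s].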

V. CONCLUSIONS
In this work, an indirect adaptive controller is designed for a DEAP actuator. The recent multiple models technique with second level adaptation is applied to improve the transients. Because the parameters converge during long-time operation, a resetting algorithm was proposed. Additionally, new signals that describe the behavior of the multiple models were defined. This approach allows the multiple model structure to be reset; hence, long-time operation under varying conditions is possible. The quantitative improvement is shown by means of performance indexes. The main features of this work are: a new algorithm for resetting the multiple models structure, the analysis of multiple models adaptation for long-time operation, and an adaptive control scheme for a DEAP actuator. In future work, the approach can be extended to nonlinear adaptive control methods [19], [34].
DAMIAN CIEŚLAK received the B.Sc. degree from the Poznan University of Technology, in 2018, and the master's degree in automation and robotics from the Faculty of Computer Science, Poznan University of Technology, in 2019. He is currently working on the control of electroactive polymers and adaptive control.