A Precise Neural-Disturbance Learning Controller for Constrained Robotic Manipulators

An adaptive robust controller is introduced for high-precision tracking control of robotic manipulators with output constraints. A nonlinear function is employed to transform the constrained control objective into new constraint-free variables, which are then synthesized into a sliding-mode-like function as an indirect control objective. A robust nonlinear control signal is derived to ensure boundedness of the main control objective without violation of the physical output constraints. The control performance is improved by adopting a neural-network model with conditioned nonlinear learning laws to deal with nonlinear uncertainties and disturbances in the system dynamics. A disturbance-observer-based control signal is additionally injected into the neural nonlinear system to eliminate the approximation error and achieve asymptotic tracking accuracy. Performance of the overall control system is validated by intensive theoretical proofs and comparative simulation results.


I. INTRODUCTION
During the past few decades, robots have played a crucial role in industrial, manufacturing, exploration, rescue, and daily-life activities. Precise position controllers are required in most industrial robots [1], [2]. Nonlinear uncertainties and unpredictable external disturbances arising from the working environment are barriers to achieving excellent control performance [3]-[6]. Furthermore, in real-time control situations, robot joints work in limited regions. Violation of the physical constraints could trigger serious problems that endanger the control systems [4], [5]. To realize control objectives under such severe conditions, a large body of research on output-constrained control has been published. Advanced techniques were successfully applied to a 2-degree-of-freedom hydraulic robot arm [7] and a wheeled inverted pendulum system [8]. Backstepping control methods have recently been favored for dealing with the constrained control objective [9], [10]. Design procedures of the nonlinear controllers were mainly based on barrier Lyapunov functions [11], [12]. Static and dynamic constraints were investigated to bring the systems closer to practical use [11], [13]. Sliding-mode-driven control methods were also interesting approaches whose design flowcharts were simpler than the backstepping ones [14], [15]. Soft boundaries were adopted to force the control objectives to converge to an arbitrary vicinity following desired transient performances [16], [17].
To improve the control quality, the nonlinear uncertainties and external disturbances in the system dynamics need to be tackled [14]-[18]. The system behaviors could be derived using typical analyses such as the Newton-Euler or Lagrange methods, or decomposition principles [18], [19]. Such methods are only applicable to simple or specific robots [7], [20]. To reduce the analysis effort of the classical methods, time-delay estimation approaches were considered potential solutions [21], [22]. The total system dynamics could easily be computed from measured acceleration signals and selected input-gain matrices [23], [24]. The success of these fast estimation approaches has been proven by theoretical analyses and real-time applications [24], [25]. However, due to the use of high-order time-derivative terms, the amplified noise effect could overwhelm the original dynamics [26], [27]. To attenuate the unexpected deviation in the functional approximation process while preserving the model-free properties, disturbance observers are promising candidates [14], [28]. All dynamical effects, including nonlinear uncertainties and external disturbances, could be completely estimated by first-order or high-order observer structures. Strictly constrained asymptotic observation performances were obtained using linear excitation functions [29]-[32]. The estimation power could be further enhanced with nonlinear activation functions [33]-[35]. As a result, excellent control outcomes normally resulted from disturbance-observer-based control [14], [28]-[36]. However, the adoption of a single channel to estimate the complicated dynamics of robots under diverse working conditions limited their performances, especially for large disturbances at high working frequencies [36], [37].
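The noise-amplification issue of derivative-based estimation noted above can be checked numerically: double finite-differencing a position signal with even tiny measurement noise scales that noise by roughly 1/Δt², quickly burying the true acceleration. The sampling rate and noise level below are illustrative assumptions, not values from the paper.

```python
import numpy as np

# Sketch: estimating acceleration by double finite differencing of a noisy
# position signal amplifies measurement noise by roughly 1/dt^2.
rng = np.random.default_rng(0)
dt = 0.001                                  # 1 kHz sampling (assumed)
t = np.arange(0.0, 1.0, dt)
q_true = np.sin(2 * np.pi * t)              # true joint position (rad)
noise_std = 1e-4                            # small encoder noise (rad, assumed)
q_meas = q_true + rng.normal(0.0, noise_std, t.size)

# Second-order central difference: qdd[k] ~ (q[k+1] - 2 q[k] + q[k-1]) / dt^2
qdd_est = (q_meas[2:] - 2 * q_meas[1:-1] + q_meas[:-2]) / dt**2
qdd_true = -(2 * np.pi) ** 2 * np.sin(2 * np.pi * t[1:-1])

err_std = np.std(qdd_est - qdd_true)
# Position noise of 1e-4 rad becomes acceleration noise of order
# sqrt(6)*noise_std/dt^2, far larger than the true peak acceleration ~4*pi^2.
print(err_std > np.max(np.abs(qdd_true)))
```

A 1e-4 rad noise floor, harmless at the position level, dominates the differentiated signal, which is why the observer-based alternatives discussed next avoid explicit high-order differentiation.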
This drawback could be overcome by integrating nominal system models into the disturbance techniques [15], [38]. Although some effort is required to build the simple model, the higher control performance achieved with this combination is worth it. Thanks to the universal approximation ability, learning the system dynamics using black-box models such as neural networks or fuzzy-logic engines is an interesting alternative remedy [39]-[41]. Compared to disturbance-observer structures, such intelligent approaches yield better estimation efficiency for large model variations at high-speed operation thanks to their multi-channel learning characteristics [38], [42]. In intelligent robust control processes, radial-basis-function (RBF) networks or fuzzy hybrid networks were activated by various pieces of control-error information [43]-[45]. Since the network convergence depended on the richness of the excitation signals, it was difficult for the controllers to yield outstanding transient control performances [42], [46]. As a solution, the learning laws of such networks have been modified using linear leakage terms [4], [47]. The new learning rules could ensure finite bounds on the weighting coefficients. However, because the learning behaviors still worked outside the activation ranges, the convergence rates became slow throughout the whole process. Even though neural networks can estimate various nonlinear functions with an arbitrary degree of accuracy, for excellent control performance one still needs to cope with the remaining approximation errors [11], [13]-[48].
The combination of neural networks and disturbance observers has attracted considerable attention from engineers and researchers [49]-[51]. In [52], such a combination was successfully employed for stiffness control of a certain robot. The structure has also recently shown impressive control results on exoskeleton and cooperative robots [34], [38]. However, the adoption of leakage functions in the neural networks obliged the control phase to employ strong robust gains for asymptotic control performance. Furthermore, the integration of the neural network and disturbance observer still ensured only finite bounds on the control error.
Motivated by these merging techniques, in this paper we propose a new adaptive robust controller for high-precision position tracking control of constrained robotic manipulators. Constraints on the control objective are first tackled using a nonlinear transformation function. A sliding-mode-like control framework is employed to drive the control objective to a certain bound around zero. A proper combination of a neural network and a disturbance observer is utilized to force the closed-loop system to converge to zero asymptotically. The contributions of the proposed controller are summarized as follows: • A new constrained nonlinear controller is developed as a multipurpose background in which the control objective is at least stabilized inside an arbitrary vicinity of zero without violation of the physical constraints.
• Nonlinear uncertainties and external disturbances in the system dynamics are efficiently dealt with by an RBF neural network which is connected with the controller by nonlinear learning laws.
• A nonlinear disturbance observer is then properly injected to the neural control signal to eliminate the neural approximation errors for maintaining asymptotic stability of the overall system.
• The effectiveness of the proposed controller is verified by conditioned Lyapunov-based theories and intensive simulation results. The remainder of the paper is organized as follows. The general dynamics of the robots and the problem statements are presented in Section II. The proposed controller, incorporating the neural network and disturbance observer, is designed in Section III. Validation results of the whole control system are discussed in Section IV. Finally, the paper is concluded in Section V.

II. SYSTEM MODELING AND PROBLEM STATEMENTS
The dynamics of a serial n-joint robot are generally formulated as follows [18], [19]:

M(q)q̈ + C(q, q̇)q̇ + g(q) + f(q̇) = τ + τ_d    (1)

where q, q̇, q̈ ∈ ℝⁿ are the vectors of joint positions, velocities, and accelerations, respectively, τ ∈ ℝⁿ is the vector of joint torques or the control input, M(q) ∈ ℝⁿˣⁿ is the symmetric positive-definite inertia matrix, C(q, q̇)q̇ ∈ ℝⁿ is the Coriolis/centripetal vector, g(q), f(q̇) ∈ ℝⁿ are the gravitational and frictional torques, respectively, and τ_d ∈ ℝⁿ is the vector of external disturbance torques.
Remark 1: The disturbance τ_d represents the influence of external environments on the system dynamics. For real-time robotic applications, the maximum powers of the systems are finite. They can only complete their missions under the effect of finite-energy external disturbances [38], [53]. Hence, the following assumption is reasonable:
Assumption 1: The disturbance (τ_d) and its time derivative (τ̇_d) are bounded.
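As a concrete single-joint instance of this model, a damped pendulum can be written in the same form and integrated numerically. All numbers below are illustrative assumptions, not parameters of the studied robot.

```python
import numpy as np

# Single-joint instance of the general dynamics (inertia, gravity, friction):
# M*qdd + g(q) + f(qd) = tau + tau_d, with M = m*l^2, g(q) = m*g0*l*sin(q),
# and viscous friction f(qd) = b*qd. Illustrative values only.
m, l, b, g0 = 1.0, 0.5, 0.4, 9.81
M = m * l ** 2
dt = 1e-3
q, qd = 0.5, 0.0                 # released from 0.5 rad, at rest
tau = tau_d = 0.0                # unforced, undisturbed response
for _ in range(10000):           # 10 s, semi-implicit Euler integration
    qdd = (tau + tau_d - b * qd - m * g0 * l * np.sin(q)) / M
    qd += dt * qdd               # update velocity first (symplectic flavor)
    q += dt * qd
# friction dissipates the swing: the joint settles near the stable equilibrium
print(abs(q) < 0.01 and abs(qd) < 0.01)
```

The semi-implicit (velocity-first) update is a small design choice that keeps the lightly damped oscillation numerically well behaved at this step size.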
Remark 2: The system is a passive model with bounded time-derivative states [18], [21], [35]. Robot joints are furthermore limited to physical ranges:

q̲ ≤ q ≤ q̄    (2)

where q̄ and q̲ are respectively the upper and lower physical bounds of the system output (q). In fact, physical collisions could cause unexpected impacts and endanger the system.
Assumption 2: The desired trajectory (q_d) is a known, bounded, twice continuously differentiable signal and lies inside the physical range (q̲, q̄) of the system output. The system states (q, q̇) are measurable.
Remark 3: The main objective of this paper is to design a proper model-free controller that drives the tracking error between the system output (q) and a reference profile (q_d) to zero, or as small as possible, while keeping the system output within its feasible range. This task is challenged by unknown nonlinear dynamics, uncertainties, modeling errors, complicated disturbances from different working environments, and collision avoidance.

III. CONSTRAINED NEURAL-DISTURBANCE LEARNING NONLINEAR CONTROLLER
In this section, a proper procedure is employed to design the adaptive robust controller based on a constrained sliding mode scheme incorporating the learning ability of a neural network and a disturbance observer. The stability of the overall system is then verified by theoretical analyses.

A. CONSTRAINED NEURAL SLIDING MODE CONTROL
First, the dynamics (1) could be simplified as follows:

M̄q̈ = τ + v    (3)

where v = (M̄ − M(q))q̈ − C(q, q̇)q̇ − g(q) − f(q̇) + τ_d ∈ ℝⁿ is a lumped dynamical term combining the inertial mismatch, the Coriolis/centripetal vector, and the gravitational, frictional, and external disturbance torques, and M̄ = diag[m̄₁, m̄₂, …, m̄ₙ] is a selected diagonal positive-definite matrix.
For serial manipulators, the lumped dynamics (v) are bounded [18], [52], but a detailed description is not easy to derive [19]. To assist this work, an RBF neural network could be employed as a corresponding approximator. The dynamics v = [v₁, v₂, …, vₙ]ᵀ could be expressed as the following linear combination:

v = v_c + v_dc,  τ = τ_c + τ_dc,  v_ci = wᵢᵀξᵢ(q, q̇, τ_c) + δᵢ    (4)

where (v_dc, τ_dc) and (v_c, τ_c) are respectively the discontinuous and continuous portions of the lumped dynamics and control input, and ξᵢ(q, q̇, τ_c), wᵢ, δᵢ are respectively the regression vector, the optimal weight vector, and the neural approximation error.
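The linear-in-weights RBF structure can be sketched as follows. The Gaussian centers, width, and the offline least-squares fit are assumptions for illustration; the paper instead adapts the weights online through its learning laws.

```python
import numpy as np

# Minimal RBF approximator sketch: the regression vector xi(x) is built from
# Gaussian basis functions and the estimate is the linear combination
# v_hat = w_hat^T xi(x). Centers/width are assumed, not from the paper.
def gaussian_regressor(x, centers, width):
    """xi_j(x) = exp(-||x - c_j||^2 / width^2) for each center c_j."""
    d2 = np.sum((centers - x) ** 2, axis=1)
    return np.exp(-d2 / width**2)

def rbf_estimate(x, w_hat, centers, width):
    return w_hat @ gaussian_regressor(x, centers, width)

# Toy check of approximation power: fit v(x) = sin(x) on [0, 2*pi] by least
# squares over the weights (the paper adapts w_hat online instead).
centers = np.linspace(0, 2 * np.pi, 15).reshape(-1, 1)
width = 0.8
xs = np.linspace(0, 2 * np.pi, 200).reshape(-1, 1)
Phi = np.array([gaussian_regressor(x, centers, width) for x in xs])
w_hat, *_ = np.linalg.lstsq(Phi, np.sin(xs).ravel(), rcond=None)
max_err = np.max(np.abs(Phi @ w_hat - np.sin(xs).ravel()))
print(max_err)   # small residual: the network covers the nonlinearity
```

With 15 Gaussians spanning one period, the fit error is tiny, which is the "universal approximation" property the controller relies on; the remaining residual is exactly the error δ the disturbance observer later absorbs.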
The approximation v̂_ci of the dynamics v_ci is designed as [16], [52]:

v̂_ci = ŵᵢᵀξᵢ(q, q̇, τ_c)    (5)

where ŵᵢ is the estimate of the weight vector wᵢ. The main control error is now defined as follows:

e = q − q_d    (6)

The constraint (2) yields the following feasible range of the main error:

e̲ ≤ e ≤ ē    (7)

where ē = q̄ − q_d and e̲ = q̲ − q_d are respectively the upper and lower bounds of the control error (e). A constraint-free variable is then employed as a projection of the constrained error onto another space using the following transformation:

σᵢ = ρ(eᵢ) eᵢ/(ēᵢ − eᵢ) + (1 − ρ(eᵢ)) eᵢ/(eᵢ − e̲ᵢ)    (8)

where σ = [σ₁, σ₂, …, σₙ]ᵀ is the transformed error, eᵢ is a specific entry of the control error vector e = [e₁, e₂, …, eₙ]ᵀ, and ρ(eᵢ) is a step function that equals 1 for eᵢ ≥ 0 and 0 otherwise. A nonlinear sliding manifold is synthesized as an indirect control objective of the studied system:

s = ė + K₀σ    (9)

where K₀ = diag[k₀₁, …, k₀ₙ] is a positive-definite diagonal gain matrix.
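The transformation idea can be sketched numerically. The map below is one common barrier-type choice built around a step function, an assumption for illustration rather than necessarily the paper's exact function: it sends a constrained error e ∈ (e̲, ē) to an unconstrained σ that blows up near either bound.

```python
import numpy as np

# Barrier-type error transformation sketch (assumed form). It maps the
# constrained error e in (e_lo, e_hi), with e_lo < 0 < e_hi, onto an
# unconstrained sigma in (-inf, +inf), with sigma(0) = 0 and sigma increasing.
def transform(e, e_lo, e_hi):
    rho = 1.0 if e >= 0.0 else 0.0          # step function rho(e)
    return rho * e / (e_hi - e) + (1.0 - rho) * e / (e - e_lo)

e_lo, e_hi = -0.5, 0.8
assert transform(0.0, e_lo, e_hi) == 0.0
# sigma blows up as the error approaches either physical bound:
print(transform(0.799, e_lo, e_hi))   # large positive
print(transform(-0.499, e_lo, e_hi))  # large negative
```

Because any finite σ corresponds to an error strictly inside (e̲, ē), keeping σ bounded by the controller automatically keeps the physical output inside its constraint, which is the whole point of the projection.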
By differentiating the manifold (9) with respect to time and noting the dynamics (3), we have:

ṡ = M̄⁻¹(τ + v) − q̈_d + K₀σ̇    (10)

From the system (10), the final control signal can be designed to stabilize the tracking error to zero asymptotically using the following structure:

τ = M̄(τ_MOD + τ_DRI + τ_ROB)    (11)

The roles of the detailed control signals (τ_MOD, τ_DRI, τ_ROB) are explained hereafter. τ_MOD is a model-compensation signal used to eliminate the internal dynamics (v) and the other terms of the manifold dynamics (10). Hence, the signal is structured as follows:

τ_MOD = q̈_d − K₀σ̇ − M̄⁻¹v̂_c    (12)

By applying the model-compensation signal (12), the dynamics (10) of the sliding manifold could be reformed as:

ṡ = τ_DRI + τ_ROB + M̄⁻¹ṽ_c    (13)

where ṽ_c = v_c − v̂_c ∈ ℝⁿ is the overall continuous estimation-error vector. The role of the signal τ_DRI is to force the sliding manifold from a certain initial position back toward zero. Thus, it is designed as:

τ_DRI = −K₁ sig^p(s),  sig^p(s) = [|s₁|^{p₁}sgn(s₁), …, |sₙ|^{pₙ}sgn(sₙ)]ᵀ    (14)

where K₁ = diag[k₁₁, …, k₁ₙ] is a positive-definite diagonal driving gain matrix and p = [p₁, p₂, …, pₙ]ᵀ is a vector of positive constants such that pᵢ ≥ 1 (i = 1, …, n).
The last signal, τ_ROB, is a robust control term used to suppress the overall error (ṽ_c). The signal is selected as follows:

τ_ROB = −K₂ sgn(s)    (15)

where K₂ = diag[k₂₁, …, k₂ₙ] is a positive-definite robust gain matrix.
The manifold dynamics (13) become:

ṡ = −K₁ sig^p(s) − K₂ sgn(s) + M̄⁻¹ṽ_c    (16)

Remark 4: The dynamics (13) reveal that once the estimation error (ṽ_c) is bounded, the closed-loop system is stabilized in an arbitrary vicinity of zero. A large robust gain (K₂) satisfying min eig(K₂) > max‖M̄⁻¹ṽ_c‖ could theoretically yield asymptotic control performance, but it could activate chattering phenomena. In contrast, the control precision is degraded with small robust control gains.
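A scalar toy run can illustrate the reaching behavior of this three-term structure. Everything here, the plant, gains, power, and disturbance, is an illustrative assumption: the model compensation is taken as already applied, so only a bounded residual d (playing the role of ṽ_c) remains.

```python
import numpy as np

# Toy scalar illustration of the driving + robust terms acting on the
# manifold: s_dot = u + d, with a bounded residual |d| <= 0.5. Gains are
# assumed, with the robust gain k2 chosen above the disturbance bound.
dt, T = 1e-3, 5.0
k1, k2, p = 2.0, 1.0, 1.0       # driving gain, robust gain, power (p >= 1)
s = 1.0                          # initial manifold value
for k in range(int(T / dt)):
    d = 0.5 * np.sin(k * dt)                       # bounded residual error
    tau_dri = -k1 * np.abs(s) ** p * np.sign(s)    # drives s toward zero
    tau_rob = -k2 * np.sign(s)                     # suppresses the residual
    s += dt * (tau_dri + tau_rob + d)
print(abs(s))   # s ends inside a small chattering band around zero
```

Because k2 exceeds the residual bound, s reaches a neighborhood of zero and stays there; the discrete-time sign switching is the chattering that Remark 4 warns about, which motivates the observer-based compensation of the next subsection.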
Control performance of the neural sliding mode system is evaluated by the following statements.
Theorem 1: Given the serial robotic dynamics (1) with the output constraint (2) and Assumptions 1 and 2, and employing the robust control laws (6)-(15) under the learning rules (5), (17), the closed-loop system is asymptotically stable from any initial constrained conditions if the control gains are selected to satisfy condition (18). Theorem 1 is proven in Appendix A.
Remark 5: If the estimate (v̂) approaches the real dynamics (v) with arbitrarily small error, the robust burden is significantly reduced. Employing small robust gains would then provide excellent control performance.

B. INTEGRATION OF DISTURBANCE OBSERVER CONTROL
The dominant terms of the lumped dynamics (v_c) could be learnt by the network. The neural sliding mode control structure exhibits outstanding control performance only with large robust control gains K₂. To alleviate the discontinuous terms in the control signal, it is required to compensate for the neural approximation error (δ) with a smooth estimation technique. Hence, an additional disturbance-observer-based control term is a natural candidate. The following assumption is taken into account:
Assumption 3: The neural error (δ) and its first-order time derivative are bounded. The observer dynamics are thus synthesized as in (20) and (21), where diag[α₁, α₂, …, αₙ] is a positive-definite diagonal constant matrix and ζ = [ζ₁, ζ₂, …, ζₙ]ᵀ is a virtual bounded disturbance vector.
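The mechanism can be sketched on a scalar toy plant. The block below uses a standard first-order disturbance observer built on an auxiliary variable; the specific plant, gain α, and control law are assumptions for illustration, not the paper's design (20)-(21).

```python
import numpy as np

# First-order disturbance-observer sketch on a scalar plant s_dot = u + delta
# with an unknown constant delta standing in for the neural residual.
# Estimate: d_hat = z + alpha*s with auxiliary update z_dot = -alpha*(u + d_hat),
# which gives the error dynamics d_tilde_dot = -alpha*d_tilde (for constant delta).
dt, T, alpha = 1e-3, 5.0, 20.0
delta = 0.7                      # unknown disturbance to be reconstructed
s, z = 1.0, 0.0                  # state and observer auxiliary variable
for _ in range(int(T / dt)):
    d_hat = z + alpha * s                    # disturbance estimate
    u = -2.0 * s - d_hat                     # stabilize and compensate
    z += dt * (-alpha * (u + d_hat))         # auxiliary-variable update
    s += dt * (u + delta)
d_hat = z + alpha * s
print(round(d_hat, 3), round(abs(s), 6))
```

The estimate converges to the true disturbance and s settles at zero without any discontinuous term, which is exactly why injecting such an observer lets the robust gain stay small while the residual error is removed smoothly.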
As noted in [35], the disturbance observer (21) demonstrated excellent learning efficiency for simple systems. However, executing the observer on a MIMO system, especially in combination with another parallel learning phase such as a neural network, is a different story. To integrate the newly designed signals (20)-(21) into the working system, the learning rule of the network is modified as in (22), where diag[η₂ᵢ] (i = 1, …, n) is a positive-definite diagonal constant matrix. The control performance of the overall system is stated in the following theorem.
Theorem 2: Given the serial robotic dynamics (1) with the output constraint (2) and Assumptions 1, 2, and 3, and employing the robust control laws (6)-(15) under the learning rules of the neural network (5), (22) and the disturbance observer (20), (21), the closed-loop system is asymptotically stable from any initial constrained conditions if the control gains are selected to satisfy condition (23). Theorem 2 is proven in Appendix B.
Remark 6: Theorem 2 implies that the robust control gain (K₂) does not need to be larger than a certain value for excellent control performance. Instead, the integral learning gain (k₃) takes on the robust burden.
Remark 7: Once the indirect control objective is stabilized at the origin, the main control objective is then realized in a sliding phase [24], [26]. Employing the nonlinear combination (9) speeds up the convergence of the sliding process [16], [17]. An overview of the proposed controller is graphically summarized in Fig. 1.

IV. VALIDATION RESULTS
The performance of the proposed controller was assessed by intensive numerical simulation. A linear neural-disturbance-observer backstepping (LND) controller and a conventional proportional-integral-derivative (PID) controller were implemented on the same system as benchmarks for the control-efficiency comparison. The design of the LND controller is presented in Appendix C. The simulation results obtained are discussed in the following subsection.

A. SIMULATION RESULTS
The comparative and proposed controllers were applied to position-tracking control in a simulation of a 3-degree-of-freedom (DOF) robot, as sketched in Fig. 2. The dynamics of the robot could be easily derived using classical methods [4], [18], as presented in Appendix D. The regression vector (ξ(q, q̇, τ_c)) of the RBF network was built from 9³ nodes encoded from the 9 inputs (qᵢ, q̇ᵢ, τ_cᵢ) (i = 1, 2, 3) using Gaussian functions [35], [52]. From the design (11), the continuous control signal was τ_c = M̄(τ_MOD + τ_DRI + δ̂). All initial values of the weight vectors (ŵᵢ, i = 1, 2, 3) were set to zero. The other simulation parameters of the controllers and the dynamics are shown in Tables 1 and 2, respectively. In this first simulation, sinusoidal trajectories with different frequencies (0.1 Hz, 0.3 Hz, and 0.5 Hz), as depicted in Fig. 3, were selected as the desired profiles of the robot joints. As shown in Fig. 5, the PID controller could ensure stability of the closed-loop system with bounded control errors: 1 deg at joint 1, 3.9 deg at joint 2, and 5.6 deg at joint 3. However, as observed in Fig. 4, the transient PID output violated the physical constraints. To avoid dangerous collisions, one possible remedy is to improve both the transient and steady-state errors. To efficiently suppress the nonlinear uncertainties for higher control performance, the LND controller employed a combination of neural-network and disturbance-observer estimation inside a backstepping control scheme. As a result, excellent control accuracies were provided by the LND control method: 0.095 deg at joint 1, 0.11 deg at joint 2, and 1.9 deg at joint 3. An alternative solution for physical-output violation is the adoption of the proposed constrained control algorithm. Furthermore, the nonlinear neural-disturbance combination was used to promptly compensate the system dynamics for impressive control precision.
The control performance of the proposed controller was thus remarkably improved compared to the PID one, with control errors at joints 1 and 2 of 0.1 deg and 0.12 deg, respectively. As indicated in Fig. 5, the control errors of the LND and proposed controllers were not much different under low-frequency working conditions. High-frequency control, however, is another matter. Working under the proposed nonlinear learning laws, the control accuracy (0.4 deg error) of the proposed controller for the 0.5 Hz sinusoidal trajectory was better than that of the LND controller. The estimation effect of the neural network and disturbance observer is shown in Fig. 6.
In the second simulation, the frequencies of the desired trajectories of joints 1, 2, and 3 were changed to 1 Hz, 0.3 Hz, and 0.7 Hz, respectively. The new desired profiles are shown in Fig. 7. Furthermore, a sinusoidal signal τ_{d3} = 10 + 50 sin(4πt) was applied as an external disturbance affecting joint 3. Applying the same controllers to the robotic system, the results obtained are shown in Fig. 8.
Without any adaptation feature, the performance of the PID controller was seriously degraded under the severe working conditions: the new control errors at joints 1 and 3 were 10.9 deg and 12.9 deg, respectively. Possessing the intelligent estimation technology, the control precision of the LND controller was kept within acceptable ranges: the control accuracies at joints 1 and 3 were 0.7 deg and 7.5 deg, respectively. However, as carefully observed in Fig. 8 and as proven in the previous work [38], the LND controller only ensured a finite bounded control error rather than an asymptotic one. This shortcoming was completely dealt with by the proposed learning mechanism. The nonlinear estimation rules were properly designed such that the control and estimation errors converge to zero or as close to it as possible. As also seen in Fig. 8, the gradual reduction of the proposed controller's error over time implies that the nonlinear uncertainties and the external disturbance were well approximated by the collaborative learning mechanism. Hence, these results confirm the overall effectiveness of the designed controller.

B. DISCUSSION
Recalling the results of the two simulations, particularly Figs. 5 and 8, the control performances of the two neural-disturbance-based controllers were almost the same for low-frequency reference signals but significantly divergent in the high-frequency cases. The improvement of the proposed controller came from the success of the nonlinear learning laws. As observed in Figs. 7 and 9, the neural network played the key role in estimating the system dynamics while the disturbance observer covered the remaining approximation errors. Furthermore, as presented in Fig. 9, the estimation outputs of the neural network and disturbance observer stabilized within certain ranges, which resulted in smooth control signals, as illustrated in Fig. 10.
The maximum absolute (MA) and root-mean-square (RMS) values of the control errors over a specific window (80 s to 90 s) are presented in Table 3. The proposed controller always provided the best RMS error even though in some cases its MA performance was not the best. The RMS/MA error ratios of the PID, LND, and proposed controllers were approximately 0.65, 0.4, and 0.3, respectively. These factors indicate that the nonlinear uncertainties and disturbances were efficiently compensated for by the proposed control technology. Hence, the analytical and testing results prove the outperformance of the studied control method over the previous ones.
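The Table 3 style metrics are straightforward to compute from a logged error trace. The error signal below is synthetic; only the MA/RMS formulas follow the text.

```python
import numpy as np

# MA and RMS error metrics over the evaluation window, as used in Table 3.
def ma_rms(err):
    ma = np.max(np.abs(err))                 # maximum absolute (MA) error
    rms = np.sqrt(np.mean(err ** 2))         # root-mean-square (RMS) error
    return ma, rms, rms / ma

t = np.linspace(80.0, 90.0, 10001)           # the 80 s - 90 s window
err = 0.1 * np.sin(2 * np.pi * 0.5 * t)      # synthetic steady-state error
ma, rms, ratio = ma_rms(err)
# A pure sinusoidal error gives RMS/MA = 1/sqrt(2) ~ 0.707; a persistent bias
# pushes the ratio up, while rare short spikes push it down. That is why a
# small RMS/MA ratio suggests the residual error is spiky rather than
# systematic, i.e., the steady dynamics have been compensated.
print(round(ratio, 3))
```

This interpretation matches the comparison in the text: the proposed controller's ratio near 0.3 indicates its residual error is dominated by brief transients rather than a sustained tracking offset.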

V. CONCLUSION
In this paper, an adaptive robust controller was proposed for high-precision position-tracking control of constrained robotic manipulators based on a new neural-disturbance-based sliding mode scheme. A nonlinear control signal is generated to realize the control objective within the feasible physical output constraints. The influence of system nonlinearities, uncertainties, and unpredictable external disturbances on the control performance is suppressed by neural-network estimation through a modified learning law. A special disturbance observer was properly integrated into the developed control framework to obtain higher control accuracy by eliminating the remaining approximation error. The control performance of the closed-loop system was intensively verified by theoretical proofs and extended simulation results.

APPENDIX A PROOF OF THEOREM 1
We consider the following Lyapunov function (A.1). By noting the dynamics (16) and (17), the time derivative of the function (A.1) is given by (A.2). If condition (18) is satisfied, there always exist positive constants λᵢ₁, λᵢ₂ (i = 1, …, n) complying with the resulting inequality. This completes the proof of Theorem 1.
APPENDIX B PROOF OF THEOREM 2
We synthesize a new Lyapunov function (B.2), where L₂₀ᵢ (i = 1, …, n) is a positive constant selected as in [35]. Differentiating the function (B.2) with respect to time and employing the dynamics (B.1) and (22) leads to an inequality in which, under the gain constraint (23), there always exists another constant λᵢ₃ (i = 1, …, n) bounding the derivative. This completes the proof of Theorem 2.

APPENDIX C REDESIGN OF COMPARATIVE LINEAR NEURAL-DISTURBANCE-OBSERVER BACKSTEPPING CONTROLLER
The linear neural-disturbance-observer backstepping (LND) controller is designed based on a previous work [38]. Without loss of generality, the comparative controller is derived from the perspective of a single joint, starting from the main control objective (6), where k_c0i (i = 1, …, n) is a positive virtual control gain. The final control signal of the scheme is then derived, where k_c1i is a positive control gain, p = [q, q̇, ż, ṙ]ᵀ is the input variable of the regression vector ψᵢ (i = 1, …, n) of the RBF neural network, and ŵᵢ (i = 1, …, n) is the weight vector of the neural network, updated by a law with a positive leakage rate μᵢ and a positive-definite diagonal matrix Γᵢ. ϕ̂ᵢ (i = 1, …, n) is the estimate of the systematic disturbances and is computed through an auxiliary variable φᵢ (i = 1, …, n) that is estimated by a learning mechanism with a selected positive disturbance gain k_c2i.
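The effect of the linear leakage term in the LND weight update can be seen in a scalar sketch; the gains and excitation signal below are assumptions. The leakage keeps the weight bounded regardless of excitation richness, which is exactly the boundedness property, and the convergence-slowing drawback, discussed in the introduction.

```python
import numpy as np

# Sketch of a leakage ("sigma-modification") style update of the form
# w_dot = Gamma * (psi*sigma - mu*w). With bounded excitation |psi*sigma| <= 1,
# the leakage term -mu*w confines the weight to |w| <= 1/mu even without
# persistent excitation. Scalar toy with assumed gains.
dt, gamma, mu = 1e-3, 10.0, 0.5
w, w_trace = 0.0, []
for k in range(20000):
    t = k * dt
    excitation = np.sin(3 * t) * np.cos(7 * t)   # bounded excitation signal
    w += dt * gamma * (excitation - mu * w)      # leakage learning law
    w_trace.append(w)
bound = 1.0 / mu                                 # theoretical bound = 2.0
print(max(abs(v) for v in w_trace) <= bound + 1e-9)
```

The same leakage that guarantees boundedness also pulls the weight toward zero even when the estimate is accurate, which is why the proposed controller replaces it with conditioned nonlinear learning laws.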

APPENDIX D DYNAMICS OF A 3-DOF ROBOT
The detailed dynamics (1) of the robot, as sketched in Fig. 2, are derived as follows, where qᵢ, lᵢ, mᵢ, and aᵢ (i = 1, 2, 3) are the joint positions, link lengths, link masses, and frictional coefficients, respectively; g₀ is the absolute gravitational-acceleration value; and cᵢ, sᵢ, c_ij, and s_ij stand for cos(qᵢ), sin(qᵢ), cos(qᵢ + qⱼ), and sin(qᵢ + qⱼ), respectively.