Fault-Tolerant Control for AUVs Using a Single Thruster

This paper presents a fault-tolerant control scheme for an AUV in the presence of a critical actuator failure that may require an emergency operation to recover the vehicle or to drive it to a safe point. In this context, the proposed control scheme uses a single operational thruster to command the vehicle towards a desired direction and reach a safe target point. In addition, the AUV is commanded with only two control actions on the available thruster, driving the vehicle along the desired direction following a spiral-like path and keeping it within a neighbourhood of the target point. The proposed fault-tolerant control is simple and robust enough to be applied to multiple kinds of AUVs without the need for accurate parameter design. The stability and good performance of the proposed control scheme are analytically demonstrated, and simulation examples illustrate the key results derived.


I. INTRODUCTION
Oceans cover approximately 70% of Earth's surface and represent the largest reservoir of life and resources. The interest in investigating, exploring and exploiting ocean resources, and in preserving marine life and the marine environment, has been increasing over recent years, and many efforts are being undertaken in this area to develop efficient and reliable tools.
In recent years, the use of Autonomous Underwater Vehicles (AUVs) has increased significantly and they are becoming ubiquitous in multiple commercial and research operations at sea, due to their reliability and relatively low cost. In addition, their flexibility, adaptability and autonomy make them a powerful tool to replace remotely operated vehicles (ROVs) and also humans in the execution of many demanding tasks at sea. These tasks include offshore inspection, pipeline inspection, ocean exploration, habitat mapping, archaeological site mapping, and localization and tracking of underwater targets. Furthermore, their use in collaborative tasks allows for the realization of complex missions, often with relatively simple systems; see for example [1]-[3] and the references therein.
Many AUV missions are carried out in harsh environments and involve critical systems, so the behaviour of the AUVs must be reliable and efficient, and they must work correctly for extended periods of time. However, in practical applications, the malfunction or failure of one or more of the AUV systems and actuators is common. A failure on the AUV may result in the loss of the vehicle, and may even put the environment, or human and animal lives, at risk. Therefore, a fault-tolerant system becomes essential to keep control of the vehicle and recover it in case of critical failure.
The associate editor coordinating the review of this manuscript and approving it for publication was Shihong Ding.
A fault-tolerant system is composed of three modules: fault detection, fault isolation and fault accommodation [4]. Fault detection deals with the problem of recognising that there is a failure in the vehicle that prevents it from operating correctly. Fault isolation deals with the identification of the cause of the failure and its location. Finally, fault accommodation is concerned with the control of the vehicle to execute a desired task in the presence of the failure. Fault detection and isolation have been extensively studied; the reader is referred to [5] and the references therein. In this work we focus on the third module, fault accommodation, to develop a fault-tolerant control strategy for AUV motion execution in case of critical thruster failure.
Multiple solutions and approaches to fault-tolerant diagnosis and control of AUVs can be found in the literature. For example, in [4] and [6] the authors investigate fault-tolerant control schemes for an AUV with thruster redundancy. In these works, the existence of redundant thrusters is exploited to overcome failures during operation by allocating the thruster forces among the remaining actuators. In [7] the authors deal with the dynamic modelling of the propulsion system of an AUV together with a fault-tolerant control based on reallocating thruster control forces. In [8] a reconfigurable layout for an inspection vehicle is developed. This layout is based on four pivoted thrusters that can be configured depending on the requirements and mission, and provides robustness in case of failure of one of the thrusters. In [9] a dynamic surface fault-tolerant control for ROVs is presented for trajectory tracking operations. The vehicle considered is fully actuated, and a fault-tolerant thruster allocation policy is used to distribute the moments and forces among the remaining thrusters, showing good performance for tracking operations.
VOLUME 10, 2022. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
The above works consider the existence of multiple thrusters and actuators that can compensate the failure of one of them by reallocating and reconfiguring the forces provided by the remaining thrusters.
Other works study the situation in which the thruster failure reduces the number of degrees of freedom (DOF) of the vehicle. For example, in [10] geometric control theory is used to design trajectories to recover an AUV after an actuator failure. For this purpose the AUV is modelled as a forced affine connection control system, and the control is carried out using integral curves of rank one and kinematic reductions. In this case, the actuator failure implies a reduction of the available DOF, making the vehicle underactuated. In [11] a method for computing online the controller of an AUV under thruster failures is proposed. To this end, an optimization problem on a specific class of functions is defined to compute an optimal control law without using the faulty thruster. This work is extended in [12] where an online learning-based method to discover control policies to overcome thruster failures is proposed. The method is applied to a fully-actuated vehicle, which may become underactuated after the failure of one of the thrusters. For this purpose, a linear function approximator is considered to represent the fault-tolerant policy, whose parameters are learned depending on the AUV model and the mission. The AUV model is learned on board, and then it is adapted when a failure occurs. In [13] a fault-tolerant control for underactuated vehicles is developed by using a composite neural learning fault-tolerant control for path following, considering event-triggered input. The actuator faults considered are partial loss of effectiveness and bias fault, both for the rudder servo and the main engine of the vehicle, and the control scheme is able to compensate the actuator failures.
In addition, fault-tolerant control may be applied to many different kinds of vehicles and systems, see for example [14], [15] or [16]. In [14] the fault-tolerant control of a free-swimming robotic fish is studied, which is based on a feedback controller and a feed-forward compensator. In [15] the authors developed a robust fault-tolerant H∞ control with adaptive compensation. A dynamic output feedback controller, which guarantees the stability and adaptive H∞ performance of the closed-loop system, is designed by stabilizing controller gains and adaptive control gain functions, and it is applied to a linearized dynamic aircraft system. In [16] a quantized non-fragile feedback control problem for active suspension systems with and without actuator faults is studied, where ride comfort, road holding ability and hard constraints on suspension deflection and actuator are considered in a multi-objective control problem.
It is important to notice that, for an adequate fault-tolerant control, the identification and isolation of the problem is imperative. In this sense, some works deal with the complete system, such as [17], where an observer-based fault diagnosis approach is studied to detect, isolate and identify actuator faults of an AUV, and where SVM is used to compensate for the unmodelled dynamics. In [18], the authors propose an actuator fault-tolerant control scheme, which is composed of fault detection, isolation and accommodation modules, and a sliding mode approach is used for the vehicle control and fault isolation. The vehicle configuration shows redundancy of actuators, which is exploited to deal with the failure and continue with the mission objectives. In [19] an algorithm based on a Bayesian filter that detects thruster failures at run-time is developed. This algorithm adapts the vehicle's dynamical model to take into account the faulty thruster. The approach is applied to a fully-actuated vehicle where one of the thrusters has a malfunction. In [20] a finite-time fault-tolerant trajectory control using an integral sliding mode manifold is developed. In addition, a fault estimator is designed to detect, isolate and accommodate unknown faults and disturbances in a 3-DOF fully-actuated autonomous surface vehicle (ASV) model.
Other interesting approaches are [21], where a fault-tolerant lateral control against actuator faults is developed using a Kalman filter for identification of the control distribution matrix and a linear quadratic regulator (LQR) for the controller; [22], where an intelligent monitoring and emergency system is developed based on an ontological approach, which consists of three blocks: one for the formation of knowledge on fault diagnosis, a translator to convert knowledge into code, and the emergency system; and [23], where an adaptive fault-tolerant control method based on a virtual closed-loop system and an improved second-order sliding mode observer are developed and tested for an overactuated vehicle. In [24] an adaptive sliding mode fault-tolerant fuzzy tracking control is developed, considering a Takagi-Sugeno fuzzy model. This strategy tries to ensure the accessibility of the sliding motion in case of actuator faults, and is able to deal with two simultaneous actuator faults in a fully-actuated marine vehicle. Reference [25] also considers a Takagi-Sugeno fuzzy model for designing a fault-tolerant control for dynamic positioning, where quantized feedback sliding mode control is used. In [26] a fault-diagnosis observer and a fault-tolerant control are designed for ASVs in network environments. The fault-tolerant control relies on the observer to compensate for the actuator faults. A sliding mode fault-tolerant control with signal quantization is described in the presence of time delay and various thruster faults in [27], and in [28] a fault-tolerant control using integral sliding mode output feedback is used to compensate for thruster faults and disturbances, where thruster redundancy is assumed in order to fully compensate for a stuck fault. For a deeper review of fault-tolerant control systems and approaches, the reader is referred to [29] and the references therein.
Most of the above works deal with overactuated vehicles, which have redundant actuators and can reconfigure the forces and moments to keep the vehicle overactuated, or which become underactuated but retain enough control actions to keep an adequate behaviour and compensate for the failure. In contrast to the previous references, the work at hand studies the case in which the vehicle is underactuated even in its nominal operation, similarly to previous works of the authors involving marine vehicles, both for control and identification [30]-[34]. In this kind of vehicle a failure or malfunction may cause a dramatic reduction of the control actions that can be applied to the vehicle. Thus, the scenario in which a critical failure reduces the available DOF of an underactuated vehicle is studied. Moreover, we consider the extreme situation in which only one thruster is available to control the vehicle dynamics. In this situation, a constant input on the available thruster will cause the vehicle to turn in circles. Therefore, the design of an adequate control law so that the resulting trajectory, on average, follows a desired direction and drives the vehicle to a safe point is of the utmost importance. Considering that a discrete set of control inputs can be applied to the remaining thruster, namely two control inputs for the scenario at hand, the objective is to design the switching times of these control actions so that the vehicle moves in a spiral-like path which follows, on average, a desired trajectory. This makes it possible to define a control law that is not only simple, but also robust and effective, and that drives the AUV towards a desired target point. The proposed strategy can be applied to multiple vehicles since it does not need a detailed model of the vehicle, and it does not require accurate tuning of the control parameters, providing satisfactory results even for non-optimal parameters.
Therefore, the contributions of the present work are threefold:
• i) A control strategy is proposed to drive underactuated vehicles to a safe recovery point in the extreme situation in which only one thruster is in operational condition. The control strategy drives the vehicle towards a desired direction, following a spiral-like path that depends on the set of forces applied to the available thruster.
• ii) The limit case in which the only remaining thruster is controlled with just two control actions is considered. This assumption seeks a simple and effective control law that can be applied to multiple vehicles and situations, and also covers the alternative scenario in which the remaining thruster also has a malfunction and only a discrete set of control actions can be applied.
• iii) The stability of the proposed control law, and the convergence of the trajectory to a neighbourhood of the desired target point, are demonstrated analytically.
The paper is organized as follows. In Section II the notation, problem formulation, and the kinematic and dynamic models are presented. In Section III the control law to drive the vehicle to a safe recovery point in case of critical failure is proposed. In addition, the ability of the control law to drive the vehicle along a desired average direction and to keep the vehicle in the vicinity of a selected recovery point, together with its stability, are analyzed and demonstrated analytically. The main results are illustrated in Section IV, where the good performance of the control law is shown through several simulations under different conditions. Finally, in Section V the conclusions and future work are presented.

II. PROBLEM FORMULATION
A. NOTATION
For the sake of completeness and to ease the readability of the forthcoming sections, Table 1 describes the symbols and variables used in the theoretical analysis. Notice that these symbols and variables are also defined throughout the paper at their first appearance.

B. VEHICLE'S MODEL
Consider an AUV composed of two missile-shaped bodies, such as the MEDUSA Class AUV depicted in Figure 1. This kind of vehicle, which has two frontal thrusters to control surge speed and yaw rate, and two vertical thrusters for depth control, is underactuated. The AUV has positive buoyancy, which implies that in case of failure it will rise to the surface without the need for additional control actions. For this reason, the emergency control of the vehicle is studied in the horizontal plane, i.e. in 2D, once the vehicle is at the sea surface.
Following the standard convention [35], the origin of the AUV body-frame is located at its centre of mass, Figure 2. The speeds of the vehicle referred to the body-frame are $\boldsymbol{\nu} = [u, v, r]^T$, where $u$ is surge speed, $v$ is sway speed, and $r$ is yaw rate. The velocity in the inertial frame is $V = [\dot{x}, \dot{y}]^T$, where $\dot{x}$ is the speed along the X-axis and $\dot{y}$ is the speed along the Y-axis. Finally, $\psi$ is the angle that the vehicle orientation forms with the X-axis and its derivative $\dot{\psi} = r$ is the yaw rate. Then, in the absence of ocean currents, the kinematic model of the vehicle yields
$$\dot{x} = u\cos\psi - v\sin\psi, \tag{1}$$
$$\dot{y} = u\sin\psi + v\cos\psi, \tag{2}$$
$$\dot{\psi} = r. \tag{3}$$
As mentioned above, in the 2D scenario considered and under normal operation, the vehicle is controlled by two frontal thrusters whose inputs are $a_s$ and $a_p$, which are normalized reference commands for the internal controller of the propellers in the range $[-100, 100]$, i.e., the angular speed references for the starboard and port thrusters, respectively. Each propeller produces a force that is proportional to the square of its own angular speed, which is positive (push forward) when the control action is positive, and negative (push backward) when the control action is negative. Thus, the forces $F_s$ and $F_p$ produced by each thruster in response to the control actions are:
$$F_s(a_s) = K\,|a_s|\,a_s, \qquad F_p(a_p) = K\,|a_p|\,a_p,$$
where $K$ is the proportional constant which relates control actions to thrust. The total force is the sum of the forces provided by each of the thrusters, $F = F_p(a_p) + F_s(a_s)$, which also produces the torque $\tau = L \cdot (F_p(a_p) - F_s(a_s))$, $L$ being the distance of each of the thrusters to the symmetry axis of the vehicle.
In the emergency situation studied it is assumed that one of the thrusters becomes faulty and cannot be operated.
For the sake of simplicity, and without loss of generality, the starboard thruster has been considered as the faulty one, so $a_s = 0$ and $F_s(a_s) = 0$. Thus, only the control action $a_p$ is available, and the total force and torque become $F = F_p(a_p)$ and $\tau = L F_p(a_p)$, respectively, allowing only a partial control of the surge speed and the yaw angle. Therefore, according to [36], [37] and given the above constraint, the dynamics of the vehicle yields
$$m_u \dot{u} - m_v v r + D_u u = F_p(a_p), \tag{4}$$
$$m_v \dot{v} + m_u u r + D_v v = 0, \tag{5}$$
$$m_r \dot{r} - m_{uv} u v + D_r r = L\,F_p(a_p), \tag{6}$$
where $m_u$, $m_v$ and $m_r$ are the mass and inertia constants (mass of the vehicle plus added masses that arise from the interaction with the surrounding water), and $m_{uv} = m_u - m_v$. The Coriolis terms $v \cdot r$, $u \cdot r$, and $u \cdot v$ are caused by the fact that the body frame is rotating. The drag terms are
$$D_u = -X_u - X_{|u|u}|u|, \qquad D_v = -Y_v - Y_{|v|v}|v|, \qquad D_r = -N_r - N_{|r|r}|r|,$$
where $X_u$, $X_{|u|u}$, $Y_v$, $Y_{|v|v}$, $N_r$ and $N_{|r|r}$ are the hydrodynamic coefficients, which are negative, and then $D_u$, $D_v$ and $D_r$ are positive.
Therefore, the objective is to design a control action over $a_p$ so that the system (1)-(6) converges to a neighbourhood of the origin from any initial condition, i.e., the AUV can be driven with the use of a single thruster to a desired (recovery) area.
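As an illustration of the single-thruster model, the following minimal Python sketch evaluates the right-hand side of the system (1)-(6) for the port thruster only. It assumes the usual 3-DOF form of the kinematics and dynamics with linear-plus-quadratic drag and a signed quadratic thrust law; the function names (`thrust`, `dynamics`) and all numerical parameter values are hypothetical choices for illustration and are not taken from the paper.

```python
import math

# Hypothetical, illustrative parameter values (not taken from the paper).
PARAMS = {
    "m_u": 37.0, "m_v": 47.0, "m_r": 4.64,  # mass/inertia incl. added mass
    "X_u": -0.2, "X_uu": -25.0,             # surge drag coefficients (negative)
    "Y_v": -55.1, "Y_vv": -101.3,           # sway drag coefficients (negative)
    "N_r": -4.14, "N_rr": -6.23,            # yaw drag coefficients (negative)
    "L": 0.25,                              # thruster lever arm [m]
    "K": 3.6e-3,                            # thrust gain
}


def thrust(a, K=PARAMS["K"]):
    """Signed quadratic thrust: a positive command pushes forward."""
    return K * abs(a) * a


def dynamics(state, a_p, p=PARAMS):
    """Time derivative of (x, y, psi, u, v, r) with only the port thruster."""
    x, y, psi, u, v, r = state
    F = thrust(a_p, p["K"])      # total force (starboard thruster is off)
    tau = p["L"] * F             # torque produced by the same thruster
    # Drag terms: positive because the hydrodynamic coefficients are negative.
    D_u = -p["X_u"] - p["X_uu"] * abs(u)
    D_v = -p["Y_v"] - p["Y_vv"] * abs(v)
    D_r = -p["N_r"] - p["N_rr"] * abs(r)
    m_uv = p["m_u"] - p["m_v"]
    dx = u * math.cos(psi) - v * math.sin(psi)
    dy = u * math.sin(psi) + v * math.cos(psi)
    dpsi = r
    du = (p["m_v"] * v * r - D_u * u + F) / p["m_u"]
    dv = (-p["m_u"] * u * r - D_v * v) / p["m_v"]
    dr = (m_uv * u * v - D_r * r + tau) / p["m_r"]
    return (dx, dy, dpsi, du, dv, dr)


if __name__ == "__main__":
    at_rest = (0.0, 0.0, 0.0, 0.0, 0.0, 0.0)
    print(dynamics(at_rest, 50.0))
```

Starting from rest, a positive command on the single thruster accelerates the vehicle in surge and in yaw at the same time, which is the coupling that prevents independent control of speed and heading.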

III. CONTROL DESIGN
The AUV velocity and position control are derived for the case in which a critical thruster failure occurs and only one thruster is available. Moreover, this remaining available thruster may also have a limited set of actions due to possible additional damage.
Notice that all the state variables $x(t)$, $y(t)$, $\psi(t)$, $u(t)$, $v(t)$ and $r(t)$ are time-dependent functions. Nevertheless, in the following, for the sake of clarity and with an abuse of notation, the explicit indication that the state variables are time-dependent will be omitted when there is no ambiguity.

A. AVERAGE VELOCITY CONTROL
In the situation where one of the horizontal thrusters is not working, the AUV should be controlled and driven to a desired safety point. Then, a control action such that the vehicle moves, on average, towards a desired direction must be defined. It is important to notice that, as the single control action $a_p$ is used and it directly affects both the surge speed $u$ and the yaw rate $r$, it is not possible to make the vehicle follow a straight path with a certain desired reference surge speed and orientation, as would be expected if both control actions $a_s$ and $a_p$ could be applied.
This fact can be seen by noticing that a constant orientation $\psi$ implies that $r = 0$, by Eq. (3). Thus, (5) becomes $m_v \dot{v} + D_v v = 0$, which has a globally asymptotically stable equilibrium point at $v = 0$. The latter is straightforward to prove by considering the Lyapunov function $V = \frac{1}{2} m_v v^2$, whose derivative $\dot{V} = -D_v v^2$ is negative for $v \neq 0$. Moreover, with $r = 0$ and $\dot{r} = 0$, (6) reduces to
$$-m_{uv}\, u\, v = L\,F_p(a_p),$$
where the left-hand side tends to 0 as $t \to \infty$, so $F_p(a_p) \to 0$. Therefore, $u \to 0$ too, since (4) reduces to $m_u \dot{u} + D_u u = 0$, which is also globally asymptotically stable, and it is not possible to converge to a sustainable forward velocity with constant orientation.
At this point, it is important to analyze the AUV behaviour when a constant control action $a_p = a_0$ is applied. In this situation, the equilibrium solution of (4)-(6) satisfies
$$-m_v v_e r_e + D_u(u_e)\, u_e = F_p(a_0), \tag{7}$$
$$m_u u_e r_e + D_v(v_e)\, v_e = 0, \tag{8}$$
$$-m_{uv} u_e v_e + D_r(r_e)\, r_e = L\,F_p(a_0). \tag{9}$$
Let us suppose that this equilibrium point is globally asymptotically and exponentially stable. In a normal situation, this is usually true since vehicles are designed to have stable dynamics for constant control actions (the existence of such points will be shown later in Lemma 4).
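The yaw rate equation in the time domain can be written as $r(t) = r_e + r_t$, where $r_t$ is a transitory term that, due to the exponential convergence, can be bounded as
$$|r_t(t)| \le c_1 e^{-c_2 t},$$
$c_1$ and $c_2$ being two positive constants. Thus, the yaw angle can be defined as
$$\psi(t) = \psi(0) + \int_0^t r(s)\,ds = r_e\, t + \psi_t(t), \qquad \psi_t(t) = \psi(0) + \int_0^t r_t(s)\,ds,$$
where $\psi_t$ converges exponentially to a constant value $\psi_f$. Thus, the average angular speed yields
$$\bar{r} = \lim_{T\to\infty} \frac{1}{T}\int_0^T r(s)\,ds = r_e.$$
The same analysis holds for $u(t)$ and $v(t)$, considering $u_t$ and $v_t$ the transitory terms of the surge speed and sway speed, respectively.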
Therefore, from Eq. (1), the average velocity over the X-axis of the inertial reference frame can be computed as:
$$\bar{\dot{x}} = \lim_{T\to\infty} \frac{1}{T}\int_0^T \left[(u_e + u_t)\cos(r_e s + \psi_t) - (v_e + v_t)\sin(r_e s + \psi_t)\right] ds. \tag{10}$$
By letting $e_\psi = \psi_t - \psi_f$, the first term of the integral in (10) can be rewritten as:
$$\cos(r_e s + \psi_t) = \cos(r_e s + \psi_f + e_\psi) = \cos(r_e s + \psi_f) + \xi_1(s),$$
where $\xi_1(s)$ tends exponentially to zero as $s \to \infty$ since $\psi_t$ tends exponentially to $\psi_f$, so $e_\psi \to 0$. Following similar computations, the second term in (10) yields $\sin(r_e s + \psi_t) = \sin(r_e s + \psi_f) + \xi_2(s)$, where $\xi_2(s)$ also tends exponentially to zero as $s \to \infty$, and thus, the average velocity over the X-axis of the inertial reference frame becomes:
$$\bar{\dot{x}} = \lim_{T\to\infty} \frac{1}{T}\int_0^T \left[u_e\cos(r_e s + \psi_f) - v_e\sin(r_e s + \psi_f) + \xi(s)\right] ds = 0,$$
where $\xi(s)$ gathers the exponentially vanishing terms involving $u_t$, $v_t$, $\xi_1$ and $\xi_2$. Notice that this average speed is zero because the integrals of exponentially decreasing functions are bounded, and the integrals of the trigonometric functions, which are periodic with zero average over a period, are also bounded. The same analysis holds for $\dot{y}$, the average speed over the Y-axis of the inertial reference frame. This result has an intuitive explanation. Notice that when enough time has passed, the vehicle will move with constant speed $V_e = \sqrt{u_e^2 + v_e^2}$ and constant angular velocity $r_e$. Thus, the trajectory will be a circle of radius $R_e = V_e / r_e$, and the average advance velocity is zero.
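The zero-average result for a constant input can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses the usual 3-DOF model form with a signed quadratic thrust law, hypothetical parameter values that are not taken from the paper, and a fixed-step RK4 integrator chosen here for convenience. It integrates the vehicle from rest with a constant command and reports the steady speed, the yaw rate, the circle radius $V_e / r_e$ and the net displacement divided by the elapsed time.

```python
import math

# Hypothetical, illustrative parameters (not taken from the paper).
M_U, M_V, M_R = 37.0, 47.0, 4.64
X_U, X_UU = -0.2, -25.0
Y_V, Y_VV = -55.1, -101.3
N_R, N_RR = -4.14, -6.23
L_ARM, K_THR = 0.25, 3.6e-3


def deriv(s, a_p):
    """Right-hand side of the model for the state (x, y, psi, u, v, r)."""
    x, y, psi, u, v, r = s
    F = K_THR * abs(a_p) * a_p
    D_u = -X_U - X_UU * abs(u)
    D_v = -Y_V - Y_VV * abs(v)
    D_r = -N_R - N_RR * abs(r)
    return (
        u * math.cos(psi) - v * math.sin(psi),
        u * math.sin(psi) + v * math.cos(psi),
        r,
        (M_V * v * r - D_u * u + F) / M_U,
        (-M_U * u * r - D_v * v) / M_V,
        ((M_U - M_V) * u * v - D_r * r + L_ARM * F) / M_R,
    )


def rk4_step(s, a_p, dt):
    """One classical Runge-Kutta step with the input held constant."""
    def shifted(k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))
    k1 = deriv(s, a_p)
    k2 = deriv(shifted(k1, 0.5 * dt), a_p)
    k3 = deriv(shifted(k2, 0.5 * dt), a_p)
    k4 = deriv(shifted(k3, dt), a_p)
    return tuple(si + dt / 6.0 * (a + 2 * b + 2 * c + d)
                 for si, a, b, c, d in zip(s, k1, k2, k3, k4))


def simulate_constant(a0, t_end=600.0, dt=0.05):
    """Integrate from rest at the origin with a constant command a0."""
    s = (0.0, 0.0, 0.0, 0.0, 0.0, 0.0)
    for _ in range(int(round(t_end / dt))):
        s = rk4_step(s, a0, dt)
    return s


if __name__ == "__main__":
    T = 600.0
    x, y, psi, u, v, r = simulate_constant(30.0, t_end=T)
    V_e = math.hypot(u, v)
    print("steady speed V_e  :", V_e)
    print("steady yaw rate   :", r)
    print("circle radius R_e :", V_e / r)
    print("average velocity  :", math.hypot(x, y) / T)
```

With these assumed values the vehicle keeps moving at a non-negligible speed while its net displacement stays bounded, so the displacement divided by the elapsed time shrinks as the simulation horizon grows.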
In order to obtain a non-zero average advance velocity, the symmetry of the circular trajectory must be broken, so a constant input control action is not suitable. The following control law is proposed:
$$a_p = a_0 + a \cdot \mathrm{sign}(\sin\psi). \tag{11}$$
The concept behind (11) is that the control action will be different when the vehicle is oriented to the East ($\sin\psi > 0$) than when the vehicle is oriented to the West ($\sin\psi < 0$), breaking the symmetry of the circular trajectory, see Fig. 3. Notice that the sign of the control law (11) changes when $\psi = n\pi$, with $n = 0, 1, 2, \ldots$ The value of $\psi$ for changing the sign of (11) has been chosen for simplicity of explanation, and any other angle could be chosen without loss of generality. In fact, the selected angle will depend on the desired direction of movement, as will be shown in the forthcoming analysis.
The control law works as follows: Consider that the vehicle turns clockwise ($r > 0$) describing an approximately circular trajectory (for small $a$) and pointing to the interior of the circular path because $u > 0$ and $v < 0$. Therefore, the velocity vector of the vehicle, which is tangent to the trajectory, points to port. When $\psi = 2n\pi$ the vehicle points exactly to the North, which is the frontier at which the vehicle changes its orientation from West to East (point A of Figure 3), and $a_p$ switches from $a_p = a_0 - a$ to $a_p = a_0 + a$, following the red path of Figure 3. Once $\psi$ reaches $(2n + 1)\pi$, the vehicle points exactly to the South (point B of Figure 3) and the control law switches again to $a_p = a_0 - a$, describing the blue path. Since the control action is different in both parts of the path (red and blue), the trajectory does not close, i.e. it is not a circle. Thus, the vehicle moves in "some" direction, which depends on the angle or condition used to switch the control law. This pattern is repeated periodically, producing an average movement of the vehicle on a certain course.
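The switching mechanism described above can be sketched in a few lines of Python. This is a hedged illustration and not the authors' implementation: `control_law` encodes the two-level rule of (11) as read from the text (the larger command while $\sin\psi > 0$, the smaller one otherwise), the vehicle model uses the usual 3-DOF form, and every numerical parameter, the RK4 integrator and the helper `net_displacement` are assumptions introduced here. The script compares the net displacement obtained with the switching input against the one obtained with a constant input.

```python
import math

# Hypothetical, illustrative parameters (not taken from the paper).
M_U, M_V, M_R = 37.0, 47.0, 4.64
X_U, X_UU = -0.2, -25.0
Y_V, Y_VV = -55.1, -101.3
N_R, N_RR = -4.14, -6.23
L_ARM, K_THR = 0.25, 3.6e-3


def control_law(psi, a0, da):
    """Two-level switching: a0 + da while sin(psi) > 0, a0 - da otherwise."""
    return a0 + da if math.sin(psi) > 0 else a0 - da


def deriv(s, a_p):
    """Right-hand side of the model for the state (x, y, psi, u, v, r)."""
    x, y, psi, u, v, r = s
    F = K_THR * abs(a_p) * a_p
    D_u = -X_U - X_UU * abs(u)
    D_v = -Y_V - Y_VV * abs(v)
    D_r = -N_R - N_RR * abs(r)
    return (
        u * math.cos(psi) - v * math.sin(psi),
        u * math.sin(psi) + v * math.cos(psi),
        r,
        (M_V * v * r - D_u * u + F) / M_U,
        (-M_U * u * r - D_v * v) / M_V,
        ((M_U - M_V) * u * v - D_r * r + L_ARM * F) / M_R,
    )


def rk4_step(s, a_p, dt):
    """One classical Runge-Kutta step with the input held during the step."""
    def shifted(k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))
    k1 = deriv(s, a_p)
    k2 = deriv(shifted(k1, 0.5 * dt), a_p)
    k3 = deriv(shifted(k2, 0.5 * dt), a_p)
    k4 = deriv(shifted(k3, dt), a_p)
    return tuple(si + dt / 6.0 * (a + 2 * b + 2 * c + d)
                 for si, a, b, c, d in zip(s, k1, k2, k3, k4))


def net_displacement(a0, da, t_end=1000.0, dt=0.05):
    """Distance from the start after t_end seconds under the switching law."""
    s = (0.0, 0.0, 0.0, 0.0, 0.0, 0.0)
    for _ in range(int(round(t_end / dt))):
        s = rk4_step(s, control_law(s[2], a0, da), dt)
    return math.hypot(s[0], s[1])


if __name__ == "__main__":
    print("constant input :", net_displacement(30.0, 0.0))
    print("switching input:", net_displacement(30.0, 10.0))
```

Setting `da = 0` recovers the constant-input case, so the same routine provides the baseline circle. Switching on a different heading than $\psi = n\pi$ would rotate the direction of the resulting drift, in line with the remark that the switching angle depends on the desired direction of movement.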
In order to prove the above statement, and before starting the stability analysis of the proposed control law, we must resort to some technical results:
Lemma 1: Consider the autonomous system $\dot{\mathbf{x}} = \mathbf{f}(\mathbf{x})$ such that $\mathbf{x}_e$ is an exponentially stable equilibrium point of the system and the first derivative of $\mathbf{f}$ is bounded and Lipschitz on $\mathbf{x}$ in a domain $\|\mathbf{x} - \mathbf{x}_e\| < c_0$, where $c_0$ is a positive constant. Let $\mathbf{x}_1(t)$ and $\mathbf{x}_2(t)$ be two solutions of the differential equation with initial conditions $\mathbf{x}_1(t_0)$ and $\mathbf{x}_2(t_0)$, respectively. Then
$$\|\mathbf{x}_1(t) - \mathbf{x}_2(t)\| \le c_1 e^{-c_2 (t - t_0)}\, \|\mathbf{x}_1(t_0) - \mathbf{x}_2(t_0)\|$$
for $t > t_0$, $c_1$ and $c_2$ being two positive constants.
Proof: This is a simple application of Theorem 9.1 from [38] for the unperturbed case ($\mathbf{g}(\mathbf{x}) = 0$ in equation 9.6 of [38]), when, in addition, the system is autonomous.
where $c_3$ is a positive constant and $\varepsilon_1$ is a small positive constant. Suppose also that the solutions of the system are globally uniformly ultimately bounded (GUUB) with bound $c_3$. Then $\|\mathbf{x}(t)\| \le \beta(\|\mathbf{x}(t_0)\|, t - t_0) + c_4 a_m$ for some positive constant $c_4$ and a function $\beta$ of class $\mathcal{KL}$, i.e. $\beta$ is a non-decreasing function of its first argument, and decreasing in its second argument, such that $\beta \to 0$ as $t \to \infty$.
Proof: The solutions of the system are GUUB with bound $c_3$ so, for each initial condition there exists a function Thus, before time $t_s$, as $a_p \le a_m$, we know that $\mathbf{x}$ lies in the region where $\mathbf{f}$ is Lipschitz, and following the same arguments of the proof of Lemma 4.6 of [38], we obtain that $\|\mathbf{x}(t)\| \le \beta_2(\|\mathbf{x}(t_s)\|, t - t_s) + c_4 a_m$ for $t \ge t_s$ and $\beta_2$ of class $\mathcal{KL}$. Therefore, all the vanishing terms can be bounded by a single $\mathcal{KL}$ function $\beta$ and the condition of the theorem holds for all $t \ge t_0$.
Lemma 3: Let $\boldsymbol{\nu}_e = [u_e, v_e, r_e]^T$ be the equilibrium solution of the system for $a_p = a_0 > 0$; then $u_e, r_e > 0$, $v_e < 0$ and the Jacobian $\frac{d\mathbf{f}}{d\boldsymbol{\nu}}$ is continuous and Lipschitz in a neighbourhood of $\boldsymbol{\nu}_e$.
Proof: Due to the presence of absolute values in the drag terms $D_u$, $D_v$ and $D_r$ of model (4)-(6), the Jacobian matrix of the system is not differentiable when $u$, $v$ or $r$ are zero. In the rest of the domain the system is smooth. Let us prove by contradiction that for $a_0 > 0$, the equilibrium solutions are such that $u_e \neq 0$, $v_e \neq 0$ and $r_e \neq 0$.
Suppose that $u_e = 0$; then (8) becomes $D_v(v_e) \cdot v_e = 0$ and thus $v_e = 0$. It implies that, given (7), $F_p(a_0) = 0$ and $a_0 = 0$, which is a contradiction, and demonstrates that $u_e \neq 0$. Now suppose that $v_e = 0$. Then, (8) becomes $m_u \cdot u_e \cdot r_e = 0$, which implies that $r_e = 0$ because it is already known that $u_e \neq 0$. However, given (9), $L \cdot F_p(a_0) = 0$, which is also a contradiction, and then $v_e \neq 0$.
Finally, suppose that $r_e = 0$. Given (8), it yields $v_e = 0$, which is again a contradiction, and then $r_e \neq 0$.
To check that the signs of the speeds are correct, consider the linearization of (7) and (9) around the equilibrium point $\boldsymbol{\nu}_e$:
$$|X_u| \cdot u_e = F_p(a_0), \qquad |N_r| \cdot r_e = L \cdot F_p(a_0).$$
Since $F_p > 0$, then $u_e$ and $r_e$ exist and are positive for small enough $a_0$, as shown by the inverse function theorem (see [39], Theorem 7-5). Equation (8) yields $D_v(v_e) \cdot v_e = -m_u \cdot u_e \cdot r_e < 0 \rightarrow v_e < 0$, so the signs are correct. In addition, as the speeds are only zero for $a_0 = 0$, they must keep the same sign for $a_0 > 0$.
Lemma 4: Consider the system $\dot{\boldsymbol{\nu}} = \mathbf{f}(\boldsymbol{\nu}, a_p)$ defined by equations (1)-(6). There exists an $a_m$ such that if $0 \le a_p \le a_m$, then the equilibrium point $\boldsymbol{\nu}_e$ for $a_p = a_0$ is globally exponentially stable, i.e., there exist positive constants $c_1$ and $c_2$ such that $\|\boldsymbol{\nu}(t) - \boldsymbol{\nu}_e\| < c_1 e^{-c_2 (t - t_0)} \|\boldsymbol{\nu}(t_0) - \boldsymbol{\nu}_e\|$ for $t > t_0$. Moreover, $c_1 e^{-\pi c_2 / r_e} < 1$.
Proof: Consider the Lyapunov function $V = \frac{1}{2}(m_u u^2 + m_v v^2 + m_r r^2)$, which represents the kinetic energy of the vehicle and the surrounding water. Then, $c_5 \|\boldsymbol{\nu}\|^2 \le V \le c_6 \|\boldsymbol{\nu}\|^2$, where $c_5 = \frac{1}{2}\min(m_u, m_v, m_r)$ and $c_6 = \frac{1}{2}\max(m_u, m_v, m_r)$. The derivative of $V$ becomes:
$$\dot{V} = -D_u u^2 - D_v v^2 - D_r r^2 + (u + L\, r)\, F_p(a_p). \tag{14}$$
From Theorem 4.10 of [38], the system (4)-(6) is globally exponentially stable when $a_p = 0$. Furthermore, if $a_p$ is bounded, i.e., $0 \le a_p \le a_m$, then (14) is negative if $\|\boldsymbol{\nu}\| \ge \frac{(1 + L) F_p(a_m)}{c_7}$, with $c_7 = \min(|X_u|, |Y_v|, |N_r|)$. Therefore, the system is Input-to-State Stable (ISS) with respect to $a_p$ by Theorem 4.19 of [38]. In addition, since $\left\|\frac{\partial V}{\partial \boldsymbol{\nu}}\right\| \le 2 c_6 \|\boldsymbol{\nu}\|$, and considering the special case of $\gamma(t) = 0$ and $\delta(t) = (1 + L) F_p(a_m)$ in Lemma 9.4 of [38], we obtain a bound on $\|\boldsymbol{\nu}(t)\|$ formed by an exponentially decaying term plus a term proportional to $F_p(a_m)$. This proves that the convergence to a neighbourhood of the origin is exponential, so letting $a_p = a_0$ be small enough, the linearization of the system can be as close as desired to the linearization around the origin (since the Jacobian matrix is continuous). Then, the trajectory converges exponentially to the attractive region of the linear system, and the equilibrium point is globally exponentially stable.
Lemma 5: Consider system (1)-(6) under control action (11); then it is possible to choose $a_0$ and $a$ such that the velocities of the system converge to an exponentially stable limit cycle of sustained oscillation and, in addition, the average velocity of the system also converges exponentially to a stable fixed value $[\bar{\dot{x}}, \bar{\dot{y}}]^T \to [V_{xf}, V_{yf}]^T \neq 0$.
Proof: As shown by Lemma 4, given a control action $a_p = a_0$ and a bounded disturbance $a$ small enough so that $0 < a_0 - a < a_0 < a_0 + a < a_m$, the system (4)-(6) has globally exponentially stable equilibrium points at $\boldsymbol{\nu}_e$, $\boldsymbol{\nu}_e^+$ and $\boldsymbol{\nu}_e^-$ for $a_p = a_0$, $a_p = a_0 + a$ and $a_p = a_0 - a$, respectively. In addition, all three equilibrium points satisfy $c_1 e^{-\pi c_2 / r_e} < 1$. Lemma 3 shows that the solutions are GUUB and the system is locally Lipschitz in any bounded region; thus, Lemma 2 implies that $\boldsymbol{\nu}(t)$ will converge to a neighbourhood of $\boldsymbol{\nu}_e$ that can be made arbitrarily small. Therefore, without loss of generality, we can start the analysis at a time $t_0$ such that $\|\boldsymbol{\nu} - \boldsymbol{\nu}_e\| \le c_4 a$, where $c_4$ is the positive constant of Lemma 2.
The yaw rate equation can be expressed as $r(t) = r_e^{\pm} + r_t$, where $r_t$ is a transitory function that tends to zero when $a_p$ is constant. Using the exponential convergence, and the fact that $\|\boldsymbol{\nu} - \boldsymbol{\nu}_e\| \le c_4 a$, it yields $|r_t| \le c_1 c_4 a$, and therefore, $0 \le r_{\min} \le r(t) \le r_{\max}$, where $r_{\min} = \min(r_e^+, r_e^-) - c_1 c_4 a$ and $r_{\max} = \max(r_e^+, r_e^-) + c_1 c_4 a$, both being positive if $a$ is small enough.
Since $r(t)$ is bounded and positive, $\psi(t)$ is strictly increasing and there exists a sequence of times $t_n$ such that $\psi(t_n) = n\pi$, for $n = 1, 2, \ldots$ These are precisely the times at which the control action $a_p$ switches from $a_0 - a$ to $a_0 + a$ when $n$ is even, or from $a_0 + a$ to $a_0 - a$ when $n$ is odd. Now, consider the trajectory that starts at $\boldsymbol{\nu}_n = \boldsymbol{\nu}(t_n)$, $t_n$ being an even switching time. During some time the control action $a_p = a_0 + a$ will remain constant, and the angle $\psi$ will increase from $\psi(t_n) = n\pi$ to $\psi(t_{n+1}) = (n + 1)\pi$, so:
$$\pi = \int_{t_n}^{t_{n+1}} r(t)\,dt = r_e^+\, \Delta t_n + \int_{t_n}^{t_{n+1}} r_t(t)\,dt, \tag{15}$$
where $\Delta t_n$ is the time interval between two consecutive switches of the control action, i.e., $\Delta t_n = t_{n+1} - t_n$, and $r_t = r - r_e^+$, which implies
$$\Delta t_n = \frac{1}{r_e^+}\left(\pi - \int_{t_n}^{t_{n+1}} r_t(t)\,dt\right).$$
Taking into account the exponential convergence At switching time $t_{n+1}$ the system reaches the state $\boldsymbol{\nu}_{n+1}$. Consider now another trajectory $\boldsymbol{\nu}'$ that starts at $\boldsymbol{\nu}'_n$ at the same starting time $t_n$ and ends on state $\boldsymbol{\nu}'_{n+1} = \boldsymbol{\nu}'(t'_{n+1})$ at a slightly different switching time $t'_{n+1}$, then: As (15) also holds for $t'_{n+1}$, then by subtracting both Eqs. (15) for $t'_{n+1}$ and $t_{n+1}$, and assuming without loss of generality that $t'_{n+1} \ge t_{n+1}$ (otherwise, just interchange the roles of $\boldsymbol{\nu}$ and $\boldsymbol{\nu}'$), we obtain: Finally, by the mean value theorem (see [40] Ch.11 Theorem 4),
$$\frac{\boldsymbol{\nu}'(t'_{n+1}) - \boldsymbol{\nu}'(t_{n+1})}{t'_{n+1} - t_{n+1}} = \mathbf{f}(\boldsymbol{\nu}'(t), a_0 + a)$$
for some $t$ between $t_{n+1}$ and $t'_{n+1}$. Furthermore, as $\mathbf{f}$ is locally Lipschitz with constant $c_8$, $\|\mathbf{f}(\boldsymbol{\nu}'(t), a_0 + a)\| \le c_8 c_4 a$. Thus $\|\boldsymbol{\nu}'(t'_{n+1}) - \boldsymbol{\nu}'(t_{n+1})\| \le |t'_{n+1} - t_{n+1}|\, c_8 c_4 a$, and, considering again Lemma 1 with (16) and (17): A similar analysis can be carried out for the transition from $t_{n+1}$ to $t_{n+2}$, so finally: where
$$\alpha^2 = c_1 e^{-c_2 \Delta t_{n+1}} + \frac{2 c_1 c_4 c_8 a}{c_2 r_{\min}}.$$
Note that the factor 2 in the term $\frac{2 c_1 c_4 c_8 a}{c_2 r_{\min}}$ is a consequence of the fact that in this case both $|t_{n+1} - t'_{n+1}|$ and $|t_{n+2} - t'_{n+2}|$ need to be bounded, as in (17). The constant $\alpha$ in (19) can be made less than one by selecting $a$ small enough, i.e. if $a \to 0$, then $\alpha^2 \to c_1 e^{-\pi c_2 / r_e} < 1$. Then, (19) shows that the transition between two consecutive even switching times $t_{2n} \to t_{2n+2}$ is an exponential contraction mapping from the set $\|\boldsymbol{\nu} - \boldsymbol{\nu}_e\| < c_4 a$ to itself, and thus a fixed point $\boldsymbol{\nu}^*$ exists such that if $\boldsymbol{\nu}_{2n} = \boldsymbol{\nu}^*(t_{2n})$ then $\boldsymbol{\nu}_{2n+2} = \boldsymbol{\nu}_{2n}$ by the Contraction Mapping theorem (Theorem B.1 of [38]).
To conclude the proof, the average velocity can be computed following the same procedure of (10): where the change of variables $\psi^*(t) = \psi$, $d\psi = r^*(\psi)\,dt$ has been applied. To verify that the average velocity can be made different from zero with a proper selection of $a_0$, consider the limit case in which $a_0$ is very small and $\Delta a \ll a_0$. On the one hand, from (7) and (9), expanding them up to second order in $a_0$ and $\Delta a$, we obtain $u_e^{\pm} \approx \frac{1}{|X_u|} F_p(a_0 \pm \Delta a)$ and $r_e^{\pm} \approx \frac{L}{|N_r|} F_p(a_0 \pm \Delta a)$. Thus, $\frac{u_e^{\pm}}{r_e^{\pm}} = \frac{|N_r|}{|X_u| L}$, which implies that $\frac{u^*(\psi)}{r^*(\psi)}$ is constant (up to this order of approximation) and thus the integral over a period of $\frac{u^*(\psi)}{r^*(\psi)}\cos(\psi)$ is zero. On the other hand, (8) implies that $\frac{v^*(\psi)}{r^*(\psi)}$ is a function that converges to two different values at $\psi = 0$ and $\psi = \pi$. Furthermore, the convergence to the steady state is very fast since the linearization of (4) for small enough velocities is $m_u \dot{u} + |X_u| u = F_p(a_p)$, thus $u^*(t)$ converges exponentially to $u_e^{\pm}$ with rate $\frac{|X_u|}{m_u}$. Therefore, $u^*(\psi)$ converges exponentially fast to $u_e^{\pm}$ with, at least, the rate $\frac{|X_u|}{r_{\min} m_u}$, which is very large for small enough $a_0$. The same applies to $r^*(t)$, thus $\frac{u^*(\psi)}{r^*(\psi)}$ could be replaced in the integral by its steady-state value. The same reasoning applies to $\bar{y}$ in (21); then $\bar{y} \neq 0$, and $[V_{xf}, V_{yf}] \neq 0$.
Remark 1: The proof of Lemma 5 is involved because the control action (11) is discontinuous and switches depending on the state of the system. Hence, proving the existence of an attracting limit cycle is not straightforward. Despite this, it is important to highlight that such a limit cycle exists and that the control action (11) provides satisfactory results, while being very simple and easy to apply in practice in case of a thruster failure.
Remark 2: The proof of Lemma 5 may suggest that very small $a_0$ and $\Delta a$ are needed to achieve exponential convergence to a sustained advance velocity. However, as shown in the simulation results of Section IV, control action (11) provides stable and good behaviour even for large values of $a_0$ and $\Delta a$, with $\Delta a < a_0$. In fact, in a practical situation, it is usually convenient to select $\Delta a$ as large as possible without losing stability, and such that $a_0 - \Delta a$ is also large enough. The latter ensures that the vehicle turns at a reasonable rate, that the trajectory reaches a good average velocity, and that it is able to reject disturbances.
Once the control action (11), which is able to move the vehicle in ''some'' direction $\Delta\psi$, has been defined, it is straightforward to modify it to move the vehicle along a predefined desired direction $\psi_r$:
$a_p = a_0 + \Delta a \cdot \mathrm{sign}(\sin(\psi + \Delta\psi - \psi_r))$ (22)
where $\Delta\psi$ is the direction of the average velocity when (11) is applied, i.e. $\Delta\psi = \mathrm{atan2}(V_{yf}, V_{xf})$. Therefore, control law (22) switches when $\psi = \psi_r - \Delta\psi + 2n\pi$ and $\psi = \psi_r - \Delta\psi + (2n+1)\pi$, as shown in Figure 4, and the vehicle moves in direction $\psi_r$ instead of moving in direction $\Delta\psi$ with respect to the X-axis. In this way, the trajectory obtained with (22) is rotated by an angle $\psi_r - \Delta\psi$ with respect to the trajectory obtained with (11), and the final average velocity points towards $\Delta\psi + \psi_r - \Delta\psi = \psi_r$.
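The switching rule (22) can be sketched in a few lines of Python. This is only a minimal illustration: the function name `thruster_command`, its argument names, and the tie-break used when the sine is exactly zero (treated here as positive) are assumptions introduced for the example, and `delta_psi` denotes the direction of the average velocity obtained with (11).

```python
import math


def thruster_command(psi, psi_r, delta_psi, a0, delta_a):
    """Two-level command a_p = a0 + delta_a * sign(sin(psi + delta_psi - psi_r)).

    All angles are in radians. The value returned when the sine is exactly
    zero is an assumption of this sketch (the high level is chosen).
    """
    s = math.sin(psi + delta_psi - psi_r)
    return a0 + delta_a if s >= 0.0 else a0 - delta_a
```

With the values used later in Example 1 ($a_0 = 45$, $\Delta a = 15$), the function can only return 30 or 60, which are the two thrust levels applied to the vehicle.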
Theorem 1: Consider system (1)-(6) and control action (22); then $a_0$, $\Delta a$ and $\Delta\psi$ can be chosen such that the average speed of the vehicle converges exponentially to a constant speed $V_f > 0$ in the direction of $\psi_r$.
Proof: Consider the following change of coordinates, which consists of a rotation of the inertial coordinate frame by the angle $\rho = \Delta\psi - \psi_r$:
$x' = \cos(\rho)x - \sin(\rho)y$, $y' = \sin(\rho)x + \cos(\rho)y$, $\psi' = \psi + \rho$.
Differentiating the above equations with respect to time and combining them with (1)-(3) yields
$\dot{x}' = u\cos(\psi') - v\sin(\psi')$, $\dot{y}' = u\sin(\psi') + v\cos(\psi')$, $\dot{\psi}' = r$.
Then, the dynamic equations (4)-(6) of the system are invariant under the rotation since they do not depend on the angle $\psi$ but only on its derivative, which remains unchanged. The control action now is $a_p = a_0 + \Delta a \cdot \mathrm{sign}(\sin(\psi' - \rho + \Delta\psi - \psi_r)) = a_0 + \Delta a \cdot \mathrm{sign}(\sin(\psi'))$. Thus, these are the conditions of Lemma 5, and the average velocity in the rotated frame converges to $V_f[\cos(\Delta\psi), \sin(\Delta\psi)]^T$. Finally, returning to the original inertial coordinate frame, the average velocity converges to $V_f[\cos(\Delta\psi - \rho), \sin(\Delta\psi - \rho)]^T = V_f[\cos(\psi_r), \sin(\psi_r)]^T$.
Remark 3: Computing $\Delta\psi$ from its definition in terms of the average speeds (20)-(21) is a complex task. However, it can be computed (or measured experimentally) by simply applying the control action (11) in simulation (or in a real test) for long enough to remove the transient, and then computing it directly from the average speeds as shown in Figure 4. Furthermore, this means that control law (22) can be applied even if a precise model of the vehicle is not available, by directly measuring $\Delta\psi$ with a simple experiment.
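The measurement procedure of Remark 3 can be illustrated with a short Python sketch. It assumes that a log of times and positions is available from a run with control action (11); the function name `estimate_drift_angle`, the least-squares slope used as the average velocity, and the synthetic log below are all assumptions of this example and are not taken from the paper.

```python
import math


def estimate_drift_angle(t, x, y, t_transient):
    """Estimate (V_xf, V_yf, delta_psi) from a logged trajectory.

    Samples with t < t_transient are discarded. The average velocity is
    taken as the least-squares slope of x(t) and y(t) over the remainder.
    """
    kept = [(ti, xi, yi) for ti, xi, yi in zip(t, x, y) if ti >= t_transient]
    n = len(kept)
    if n < 2:
        raise ValueError("need at least two samples after the transient")
    t_mean = sum(k[0] for k in kept) / n
    x_mean = sum(k[1] for k in kept) / n
    y_mean = sum(k[2] for k in kept) / n
    var_t = sum((k[0] - t_mean) ** 2 for k in kept)
    v_xf = sum((k[0] - t_mean) * (k[1] - x_mean) for k in kept) / var_t
    v_yf = sum((k[0] - t_mean) * (k[2] - y_mean) for k in kept) / var_t
    return v_xf, v_yf, math.atan2(v_yf, v_xf)


# Synthetic log (hypothetical numbers): a steady drift plus a circular wobble.
ts = [0.1 * k for k in range(3001)]
xs = [0.03 * s + 0.5 * math.cos(0.8 * s) for s in ts]
ys = [-0.059 * s + 0.5 * math.sin(0.8 * s) for s in ts]
v_xf, v_yf, delta_psi = estimate_drift_angle(ts, xs, ys, t_transient=50.0)
```

A slope fit is used here instead of a two-point difference because it is less sensitive to where the window starts and ends within a turn; any other averaging over an integer number of periods would serve the same purpose.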

B. POSITION CONTROL
Once the average velocity of the vehicle can be controlled, a control law that allows the vehicle to reach the neighbourhood of a desired reference position $\mathbf{X}_r = [x_r, y_r]^T$ must be defined. The direction of the vector that goes from the vehicle position $\mathbf{X} = [x, y]^T$ to the target point $\mathbf{X}_r$ is used as reference for the orientation of the average advance velocity:
$\psi_r(x, y) = \mathrm{atan2}(y_r - y, x_r - x)$ (23)
$a_p = a_0 + \Delta a \cdot \mathrm{sign}(\sin(\psi + \Delta\psi - \psi_r(x, y)))$ (24)
This strategy is depicted in Figure 5. While the vehicle makes turns, the control action will switch when $\psi = n\pi + \psi_r(x, y) - \Delta\psi$ with $n = 1, 2, \ldots$, and the trajectory will approach the target point. To prove the previous statement, it is important to notice that $\psi_r$ changes as the vehicle moves. In fact, $\psi_r$ is, in general, different for each control switch, so Theorem 1 cannot be applied directly. This difficulty is addressed by Lemma 6, which essentially ensures that, for bounded variation of $\psi_r$ over two consecutive switches of the control law, the movement of the vehicle is only slightly disturbed, and the disturbance grows linearly with the variation of $\psi_r$.
Lemma 6: Consider the system (1)-(6) under the piecewise constant control action $a_p = a_0 + \Delta a \cdot \mathrm{sign}(\sin(\psi(t) + \Delta\psi - \psi_r(t)))$, where $a_0$ and $\Delta a$ are selected according to Theorem 1. Define $t_n$ as the times where the control action switches, i.e. when $\psi(t_n) + \Delta\psi - \psi_r(t_n) = n\pi$. Define also $\delta_n = \psi_r(t_{n+1}) - \psi_r(t_n)$ as the difference between $\psi_r$ in two consecutive switches of the control law, and also
$[\Delta x_n, \Delta y_n]^T = [x(t_{n+2}) - x(t_n),\; y(t_{n+2}) - y(t_n)]^T$
as the distance covered after two consecutive control switches.
Assume that infinitely many switching times $t_n$, $n = 1, 2, \ldots$, exist, and that $\delta_n$ is bounded, $0 \le |\delta_n| < \delta_m$, for some positive constant $\delta_m$. Then, $[\Delta x_n, \Delta y_n]^T$ converges exponentially to $[\Delta x_n, \Delta y_n]^T = T V_f [\cos(\psi_r(t)), \sin(\psi_r(t))]^T + \boldsymbol{\gamma}(t)$ as $n \to \infty$, where $T$ is the period of the unperturbed trajectory ($\delta_m = 0$) and $\boldsymbol{\gamma}(t)$ is a bounded function such that, after some transient time $t_s$, $\|\boldsymbol{\gamma}(t)\| \le c_9 \delta_m$ for some positive constant $c_9$.
Proof: Consider first the system (1)-(6) under the control action $a_p = a_0 + \Delta a \cdot \mathrm{sign}(\sin(\psi(t) + g(t)))$, where $g(t)$ is an arbitrary function of time such that $|g(t_n)| \le g_m$ for some positive constant $g_m$. Following the same reasoning of Lemma 5 and considering the new term $g(t_n)$, (15) now yields: as $\psi(t)$ must go from $\psi(t_n) = n\pi - g(t_n)$ to $\psi(t_{n+1}) = (n+1)\pi - g(t_{n+1})$ for the control action to switch. Therefore, the trajectory of the velocity can be seen as a perturbed version of the fixed point trajectory $\boldsymbol{\nu}^*$ of Lemma 5. It is important to remark that the only effect of $g(t)$ is that it modifies the times at which the control action switches, so only $g(t_n)$, $n = 1, 2, \ldots$, are important for the analysis. Further, if $\delta_m = 0$, the term $g(t)$ will have no effect, and the trajectory converges to the periodic (with period $T$) solution of Lemma 5.
Consider the fixed point velocity ν ν ν * n and the perturbed velocity ν ν ν n starting at the same time t n (i.e. t n = t * n ). Then, the subsequent switching times will be different. This difference can be obtained by considering the additional term g(t n+1 ) − g(t n ) into (17): Including this new term δ m on (18) yields Then, a modified version of (19) can be computed: r min , and after k cycles: α i c 10 aδ m ≤ α k ν ν ν n − ν ν ν * n + c 10 1 − α aδ m Therefore, the difference between ν ν ν n and ν ν ν * n is globally ultimately bounded, its norm converges exponentially fast to the bound, and there exists a k large enough such that the first term becomes smaller than c 10 1−α aδ m . Naming this time t s and due to the continuity of the solutions on the finite interval, then ν ν ν(t) − ν ν ν * (t) ≤ c 10 aδ m for t s ≤ t n ≤ t ≤ t n + T , where c 10 is a positive constant.
The displacement x n = x(t n+2 ) − x(t n ) over the X-coordinate during this time interval is given by: Rewriting the first term as u cos(ψ) = u * cos( ψ) + u * (cos(ψ) − cos( ψ)) + (u − u * ) cos(ψ), we may write: where V max is the maximum advance speed that the vehicle can reach. The same argument applies for the second term v sin(ψ) of (26). Consider now the difference between the periods of the perturbed trajectory and the unperturbed one, i.e., t n+2 − (t n + T ) = (t n+2 − t n ) − T . Then, considering that − t * n , and doing the same with t n+2 − t n , yields: Since the second term of the right hand side of (27) is |, and the first term is the translation of the second one a step further (n → n + 1), both terms can be bounded using (25).
In addition, the first integral in (26) applied to the fixed point trajectory $\boldsymbol{\nu}^*$ is given by $T V_{xf}$ (see (20)), so $\Delta x_n$ may be written as $\Delta x_n = T V_{xf} + \xi_x$. Then: where each term of the first integral in (28) is bounded by $(V_{\max} + 1) c_{10} \Delta a\, \delta_m$. The term of the second integral is bounded because $|u\cos(\psi) - v\sin(\psi)| \le V_{\max}$, and the time differences are bounded by (27). The same applies to the displacement over the Y-axis, which yields $\Delta y_n = T V_{yf} + \xi_y$, where $|\xi_y| \le c_9 \delta_m$.
Proof: The vehicle keeps turning monotonically as time passes because the angular speed of the vehicle is lower bounded by $r_{\min}$. Define $\mathbf{X}_n = [x_n, y_n]^T$ as the position of the vehicle at time $t_n$, $\mathbf{X}_r = [x_r, y_r]^T$ as the target point, and $d_n = \|\mathbf{X}_n - \mathbf{X}_r\|$ as the distance from the vehicle to the target point at $t_n$, as shown in Figure 5.
Then, there are two possibilities: 1) There exists a time $t_{end}$ such that the control law stops switching for $t > t_{end}$ and remains constant. In this case the trajectory converges to a circumference, as seen in Section III-A. Furthermore, since the control law must remain constant, $\sin(\psi + \Delta\psi - \psi_r(x, y))$ is bounded and different from zero. This can only happen if the trajectory encircles the target point. Notice also that the diameter of the trajectory over one turn (i.e., the maximum distance between two points of the trajectory over one revolution) is bounded since the vehicle completes a turn in at most $\frac{2\pi}{r_{\min}}$ seconds, and, during this time, it can travel a maximum distance of $D_{\max} = \frac{2\pi V_{\max}}{r_{\min}}$. Therefore, the vehicle's trajectory encircles $\mathbf{X}_r$, the diameter is bounded by $D_{\max}$, and thus the distance from the vehicle to the target point is globally ultimately bounded by $D_{\max}$.
2) There exists an infinite sequence of times $t_n$, $n = 1, 2, \ldots$, at which the control law switches. Define $d_n = d(t_n)$ as the distance from the vehicle to the target point at switching time $t_n$, $D_n = \|\mathbf{X}(t_{n+1}) - \mathbf{X}(t_n)\|$ as the distance travelled between two consecutive switching times, $\delta_n = \psi_r(t_{n+1}) - \psi_r(t_n)$ as the difference between the orientation angles to the target point at two consecutive switching times (see Lemma 6), and $\beta_n$ as the angle shown in Figure 5.
The triangle formed by $d_n$, $d_{n+1}$ and $D_n$ in Figure 5 is such that $D_n \sin(|\beta_n|) = d_n \sin(|\delta_n|)$, and as $D_n \le D_{\max}$, then: The position $\mathbf{X}_n$ of the vehicle at time $t_n$ is: Then, define $d_{\lim}$ as follows: where $0 < \alpha < 1$ is the constant from (19), which can be chosen such that the sine term is not zero. If $d_n \ge d_{\lim}$, then $d_n \ge T V_f$, $d_n \ge D_{\max}$ and $\alpha T V_f \ge c_9 \cdot \arcsin\left(\frac{D_{\max}}{d_n}\right)$, so: Then, if $d_n \ge d_{\lim}$ the conditions of Lemma 6 apply with $\delta_m = \frac{\alpha T V_f}{c_9}$, and, for large enough $n$, the new position $\mathbf{X}_{n+1}$ at time $t_{n+1} > t_s$ becomes: where $\|\boldsymbol{\gamma}(t)\| \le c_9 \delta_m = c_9 \cdot \frac{\alpha T V_f}{c_9} = \alpha T V_f$. Therefore, in this case, the new distance $d_{n+1}$ after one turn is: The last equation implies that the distance $d_n$ decreases monotonically until it reaches $d_{\lim}$, and, once $d_n < d_{\lim}$, the distance can grow. However, if $d_n > d_{\lim}$ again, then the conditions of Lemma 6 apply and within a time $t_s$ the distance will decrease. Therefore, $d_n$ is globally ultimately bounded by $d_{\lim} + V_{\max} t_s$. Furthermore, the distance from the vehicle to the target point is $d = \|\mathbf{X} - \mathbf{X}_r\| \le \|\mathbf{X} - \mathbf{X}_n\| + \|\mathbf{X}_n - \mathbf{X}_r\| \le D_{\max} + d_{\lim} + V_{\max} t_s$, so it is GUUB.
Remark 4: It is important to highlight again the simplicity and elegance of the control law (23)-(24), which can be obscured by the complexity of the proofs. In addition, it is essential to keep in mind that this control law is able to drive the vehicle to the neighbourhood of any desired target point by using only one motor and two control values $a_p = a_0 \pm \Delta a$, which is the bare minimum needed to control the vehicle.
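To make the simplicity mentioned in Remark 4 concrete, the position control law can be sketched as follows. This is a hedged illustration: the function names `bearing_to_target` and `homing_command` are introduced here, the reference direction is assumed to be the bearing from the vehicle to the target computed with `atan2`, and the choice made when the sine is exactly zero is an assumption of the sketch.

```python
import math


def bearing_to_target(x, y, x_r, y_r):
    """Direction psi_r(x, y) of the vector from the vehicle to the target."""
    return math.atan2(y_r - y, x_r - x)


def homing_command(x, y, psi, x_r, y_r, delta_psi, a0, delta_a):
    """Two-level command that steers the average velocity towards the target.

    delta_psi is the drift direction measured with the open-loop law (11).
    The high level is returned when the sine is exactly zero (assumption).
    """
    psi_r = bearing_to_target(x, y, x_r, y_r)
    s = math.sin(psi + delta_psi - psi_r)
    return a0 + delta_a if s >= 0.0 else a0 - delta_a
```

The only quantities needed at run time are the position, the heading and the previously measured `delta_psi`; no velocity measurement or model parameter enters the computation.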

IV. RESULTS
The well performance and behaviour of the control law  Table 2. These parameters have been obtained experimentally from previous tests carried out with the MEDUSA-AUV shown in Figure 1.

A. EXAMPLE 1: MOVEMENT IN AN ARBITRARY DIRECTION
In this first example, the aim is to show how, using control action (11), the vehicle moves towards a direction with a nonzero average advance velocity.
For the example, control (11) is applied considering $a_0 = 45$ and $\Delta a = 15$. Thus, the two control actions applied to the vehicle are $a_p = a_0 - \Delta a = 30$ and $a_p = a_0 + \Delta a = 60$. Figure 6 shows how, after a transient of approximately 10 seconds, the speeds of the system quickly reach a limit cycle of sustained oscillations, as Lemma 5 claims. Therefore, by keeping this switching control law constant over the appropriate time window, the vehicle will advance in a given direction. Notice how the speeds change according to the switching of the control action $a_p$. The asymmetric switching times, which are a consequence of the different angular speeds in each interval of the control law, allow the vehicle to follow, in an average sense, a straight line. This fact can be checked in the position and orientation of the vehicle under control (11) depicted in Figure 7 for a time window of 300 seconds. Notice that from this figure it is possible to measure the average velocity after the transient. The average velocity components are $V_{xf} = 3$ cm/s and $V_{yf} = -5.9$ cm/s, and thus $V_f = 6.62$ cm/s. The red lines in the figure show how the estimated average velocity is computed. Moreover, it is possible to compute the angle $\Delta\psi = -63^\circ$ that defines the direction towards which the vehicle moves.
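The reported speed and direction follow directly from the two measured velocity components. The short check below reproduces that arithmetic; the variable names are chosen here for illustration only.

```python
import math

# Average velocity components reported for Example 1, in cm/s.
v_xf = 3.0
v_yf = -5.9

# Magnitude and direction of the average velocity.
v_f = math.hypot(v_xf, v_yf)
delta_psi_deg = math.degrees(math.atan2(v_yf, v_xf))

print(round(v_f, 2), round(delta_psi_deg))
```

The rounded values agree with the 6.62 cm/s and the angle of about -63 degrees quoted in the text.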
The whole trajectory followed by the AUV during 300 seconds is depicted in Figure 8, from which it is also possible to determine $\Delta\psi$.

B. EXAMPLE 2: GUIDANCE TO A SELECTED RECOVERY POINT
In this second example, it is shown how the vehicle can be driven from any starting point to a desired target point, which is the origin of the coordinate frame for the example at hand. The same parameters of Example 1 for the control law are used. Thus, $\Delta\psi = -63^\circ$ is already known and control action (23)-(24) can be applied. Figure 9 shows the trajectories followed by the AUV starting from 4 different initial conditions. These initial conditions $\mathbf{X}_0$, $\boldsymbol{\nu}_0 = [u_0, v_0, r_0]^T$ and $\psi_0$ are shown in Table 3. As mentioned, for all the cases the reference target point is $\mathbf{X}_r = [0, 0]$. Figure 9 shows that, starting from any of these initial conditions, the trajectories of the AUV converge to a neighbourhood of the origin following a ''straight'' (in an average sense) path, and eventually the trajectory encircles the reference point and the control law stops switching.

C. EXAMPLE 3: ROBUSTNESS TO UNCERTAINTY ON THE ANGLE $\Delta\psi$
In this third example, it is shown that an accurate estimation of the parameters, such as $\Delta\psi$, is not needed for a correct behaviour of the control law.
The robustness of the controller to changes in the parameters can be easily illustrated by selecting the angle $\Delta\psi$ as a very rough estimate of its true value. In this sense, and considering the same initial conditions of Example 2 shown in Table 3, the angle $\widehat{\Delta\psi} = -90^\circ$ is selected, almost $30^\circ$ away from the optimal one, which was $\Delta\psi = -63^\circ$. Figure 10 shows that, although the trajectories are not as ''straight'' as in Example 2, they finally encircle the reference point.

D. EXAMPLE 4: ROBUSTNESS TO CHANGES IN THE CONTROL LAW PARAMETERS
Theoretical results could suggest that, in order to obtain convergence to a neighbourhood of the target point, a careful selection of the control parameters is needed. On the contrary, the control is very robust to changes in all the control parameters. To illustrate it, Figure 11 depicts the trajectories of the system starting from initial conditions $\mathbf{X}_0 = [-20, 20]$, $\boldsymbol{\nu}_0 = [0, 0, 0]^T$, $\psi_0 = 0$, and for $a_0 = 20, 50, 66$ and $\Delta a = \frac{a_0}{2}$, so that $\Delta a \le a_0$. These values cover the full range of the actuator since $a_p = a_0 \pm \Delta a$ ranges from 10% to 99% of thrust. In addition, for all cases $\widehat{\Delta\psi} = -90^\circ$ was selected. This selected value for $\Delta\psi$ is not optimal for any of the $a_0$ and $\Delta a$ mentioned above, but it is good enough so that the vehicle does not take too much time to reach the target point. As Figure 11 shows, the control law works well in practice even for non-optimal parameters, and in all cases the trajectories finally encircle the target point. Notice that $a_0 = 20$ is a very bad selection of the control parameters, as the blue trajectory shows. It requires more time to reach the final trajectory that encircles the target point, and also the distance at which the vehicle keeps turning is larger compared with the other controllers. However, despite this bad parameter selection, the vehicle can reach the neighbourhood of the target point and keep moving in a stable circular trajectory. To clearly see the behaviour of each of the controllers for their corresponding values of $a_0$, Figure 12 shows the speeds of the AUV. Notice that for $a_0 = 66$ the trajectory encircles the target point at $t = 415$ s and the control law stops switching. For $a_0 = 50$ the trajectory encircles the target point at $t = 367$ s, but there is a transient before the control law stops switching at $t = 392$ s. Finally, for $a_0 = 20$ the trajectory takes more time to reach the target point, which is encircled at $t = 671$ s, and the transient is also longer since the vehicle gets trapped in a slowly diverging (but ultimately bounded) trajectory such that $\sin(\psi + \Delta\psi - \psi_r) \approx 0$. This condition is pathological (and a consequence of a very bad choice of the control parameters) because it causes fast switching of the controller during the time interval $672\,\mathrm{s} \le t \le 1812\,\mathrm{s}$ until it eventually stops and the AUV keeps moving on a circular trajectory around the target point. This example illustrates that the behaviour of the vehicle, once the trajectory is close to the target point, can be complex if the parameters are selected arbitrarily, randomly or wrongly.
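The statement that these settings span from 10% to 99% of thrust can be verified with a few lines of Python. The helper name `thrust_levels` is an assumption of this sketch; it simply applies the rule that the switching amplitude is half of $a_0$.

```python
def thrust_levels(a0):
    """Return the (low, high) thrust levels when the amplitude is a0 / 2."""
    delta_a = a0 / 2
    return a0 - delta_a, a0 + delta_a


# The three settings of Example 4.
levels = {a0: thrust_levels(a0) for a0 in (20, 50, 66)}
lowest = min(low for low, _ in levels.values())
highest = max(high for _, high in levels.values())
```

The smallest level comes from $a_0 = 20$ and the largest from $a_0 = 66$, which together bracket the actuator range quoted above.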

E. EXAMPLE 5: ROBUSTNESS TO DISTURBANCES
In this final example, the robustness of the controller to external disturbances is studied. To do so, a constant current $V_c$ is introduced in the direction of the Y-axis by replacing (2) by $\dot{y} = u\sin(\psi) + v\cos(\psi) + V_c$. Since the average velocity that the vehicle can reach is limited (and usually small compared with the velocity that can be reached in normal operation), the current speed must be less than $V_f$ so that the AUV can overcome it. Thus, the controller is tested for different current speeds, $V_c = 0.1V_f$, $0.4V_f$, $0.7V_f$ and $0.9V_f$, with the AUV starting from initial conditions $\mathbf{X}_0 = [-20, 20]$, $\boldsymbol{\nu}_0 = [0, 0, 0]^T$ and $\psi_0 = 0$. Figure 13 shows the good performance that is obtained for moderate disturbances (up to 70% of the average speed $V_f$ that the AUV can reach). However, when the disturbance is close to $V_f$, the performance degrades rapidly, as expected. In fact, as the disturbance increases, the average velocity of the vehicle towards the target point decreases. In the limit case $V_c = V_f$ the vehicle is not even able to approach the target point.
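The modified kinematics used in this example can be written as a small Python function. This is a sketch under stated assumptions: the function name `kinematics_with_current` is introduced here, equations (1) and (3) are assumed to have the usual planar form (surge and sway velocities rotated by the heading, and yaw rate equal to $r$), and the value of $V_f$ is assumed to be the one measured in Example 1.

```python
import math


def kinematics_with_current(u, v, r, psi, v_c):
    """Planar kinematics with a constant current v_c along the Y-axis.

    u and v are the surge and sway speeds, r is the yaw rate and psi is the
    heading in radians. v_c must use the same speed units as u and v.
    """
    x_dot = u * math.cos(psi) - v * math.sin(psi)
    y_dot = u * math.sin(psi) + v * math.cos(psi) + v_c
    psi_dot = r
    return x_dot, y_dot, psi_dot


# Current speeds tested in the example, as fractions of V_f.
V_F = 6.62  # cm/s, assumed equal to the value measured in Example 1
CURRENT_FRACTIONS = (0.1, 0.4, 0.7, 0.9)
currents = [fraction * V_F for fraction in CURRENT_FRACTIONS]
```

Only the second component differs from the undisturbed model, so the same controller code can be reused unchanged when the current is switched on.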

V. CONCLUSION
In this work, a fault-tolerant control to be used in emergency situations has been designed. This control law makes it possible to drive the vehicle to a selected target point, i.e., a safe recovery point, by carrying out a homing operation with a single operative thruster controlled by discrete inputs. The resulting trajectory of the vehicle describes a spiral-like path due to the use of a single thruster. However, by adequately controlling the switching times of the control law, it is possible to drive the vehicle, on average, towards the desired direction and keep it in the neighbourhood of the target point. The stability and convergence of the control law have been analytically demonstrated, and its good performance has been shown in several examples. In these examples, it has been shown how the control law is able to drive and keep the AUV within the neighbourhood of the desired target point even when there is uncertainty in the parameters, when the parameters are not optimal, and under constant disturbances.
Future work will aim to test the performance of the control law in an experimental setup and to develop new controllers for the same faulty situation in which only one thruster is available, while trying to optimize different criteria, such as energy consumption, time, speed, etc. This will make it possible to compare different controllers for similar situations, and to determine whether, in an emergency in which only one thruster keeps working, it is better to use an optimized controller (for a given criterion) or whether a simple and general controller, such as the one proposed in the present paper, is sufficient and more appropriate for this urgent action.
He is the author or the coauthor of more than 120 publications, including book chapters, articles in journals, and conference proceedings. His scientific interests include the control engineering field: controller design, robust control, computer control, modeling and simulation, and application of control and simulation to high speed craft, marine systems, airplanes, and robotics.
VOLUME 10, 2022