I. Introduction
Soft robots represent a paradigm shift in robotics due to the inherent flexibility of their links and joints [1], [2]. This characteristic broadens the reach of robotics as a whole, enabling the automation of novel tasks that would otherwise be unsuitable for traditional, rigid robots [3]. However, this flexibility comes at a cost: soft manufacturing introduces nonlinearities, large deformations, and theoretically infinite degrees of freedom, which make modeling and control difficult [4], [5]. Additionally, while low manufacturing cost makes these robots accessible, the aforementioned issues are further compounded by low repeatability. The community still lacks reliable control methods for soft robots that could extend their impact to fields beyond soft robotics [6]. For example, soft robots designed for surgical applications, such as the Stiff-flop [7], could greatly benefit from novel control strategies that compensate for their low positional accuracy [8]. Motivated by these potential impacts, we aim to address the problem of visual servoing with an eye-in-hand Stiff-flop robot (Fig. 1).
Fig. 1. Top: the Stiff-flop setup. Bottom left: view of the end effector. Bottom right: the pattern that characterizes the camera view, placed at a distance of 15 cm from the camera.
Learning-based approaches to soft robot modeling and control have proven to be a convenient and effective way of tackling the soft robot control problem, as they capture its highly dynamic behavior [9], with approaches spanning reinforcement learning [10], [11], deep learning [12], [13], continual learning [14], and imitation learning [15]. In this work, we therefore propose to estimate the mapping between the actuation space and the camera view with a learning-based method.
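To make the idea of such a mapping concrete, the following is a minimal sketch of how a learning-based actuation-to-image model could be fit from logged data. The three-channel pressure input, the two-dimensional image-feature output (e.g., the tracked pattern's position in the image plane), and the MLP architecture are illustrative assumptions, not the design used in this paper.

```python
# Minimal sketch (assumed setup, not the paper's actual model): learn a
# mapping from actuation space (3 chamber pressures, assumed) to camera
# view (2-D image-plane position of the tracked pattern, assumed).
import torch
import torch.nn as nn

class ActuationToImageModel(nn.Module):
    """MLP regressor: chamber pressures -> pattern position in the image."""
    def __init__(self, n_actuators: int = 3, n_features: int = 2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_actuators, 64),
            nn.ReLU(),
            nn.Linear(64, 64),
            nn.ReLU(),
            nn.Linear(64, n_features),
        )

    def forward(self, pressures: torch.Tensor) -> torch.Tensor:
        return self.net(pressures)

def train(model, pressures, features, epochs=200, lr=1e-3):
    """Fit the mapping on (pressure, image-feature) pairs collected by
    actuating the robot and tracking the pattern in the camera view."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(pressures), features)
        loss.backward()
        opt.step()
    return model

if __name__ == "__main__":
    # Synthetic stand-in data: 500 samples of 3 pressures with the
    # corresponding 2-D pattern positions (placeholder for real logs).
    P = torch.rand(500, 3)
    Y = torch.rand(500, 2)
    model = train(ActuationToImageModel(), P, Y)
```

Once such a forward mapping is learned, a visual-servoing controller can invert it, numerically or by gradient descent on the model, to find the actuation that drives the observed pattern toward a desired image-plane target.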