SOZIL: Self-Optimal Zero-Shot Imitation Learning | IEEE Journals & Magazine | IEEE Xplore

Abstract:

Zero-shot imitation learning has demonstrated its superiority in learning complex robotic tasks with less human participation. Recent studies show convincing performance under the condition that the robot strictly follows the demonstration with the learned inverse model. However, these methods struggle to achieve satisfactory imitation performance when the demonstration is suboptimal, and the learning of inverse models is vulnerable to label ambiguity issues. In this article, we propose self-optimal zero-shot imitation learning (SOZIL) to tackle these problems. The contribution of SOZIL is twofold. First, goal consistency loss (GCL) is designed to learn a multistep goal-conditioned policy from exploration data. By directly using the goal state as supervision, GCL solves the label ambiguity problem caused by trajectory and action diversity. Second, estimation-based keyframe extraction (EKE) is developed to optimize demonstrations. We formulate the keyframe extraction process as a path optimization problem under suboptimal control. By predicting how well the learned policy can execute the transition between any two states, EKE builds a directed graph containing all candidate paths and extracts keyframes by solving the graph's shortest path problem. Furthermore, the proposed method is evaluated in various simulated and real-world robotic manipulation experiments, such as cable harness assembly, rope manipulation, and block moving. Experimental results show that SOZIL achieves a higher success rate and manipulation efficiency than baselines.
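The shortest-path formulation behind EKE can be sketched as follows. This is a minimal illustration, not the paper's implementation: each demonstration state is a graph node, a directed edge i -> j (i < j) is weighted by a `transition_cost` function, and the keyframes are the nodes on the cheapest path from the first state to the last. In the paper the edge weight comes from a learned estimator of how well the policy executes the transition between two states; here `transition_cost` is a hypothetical stand-in supplied by the caller.

```python
import heapq

def extract_keyframes(states, transition_cost):
    """Select keyframe indices from a demonstration by solving a
    shortest-path problem over a directed graph whose edge i -> j
    (i < j) is weighted by transition_cost(states[i], states[j])."""
    n = len(states)
    dist = [float("inf")] * n  # cheapest accumulated cost to reach each state
    prev = [None] * n          # predecessor on the best path found so far
    dist[0] = 0.0
    heap = [(0.0, 0)]
    while heap:  # Dijkstra over the forward-only transition graph
        d, i = heapq.heappop(heap)
        if d > dist[i]:
            continue  # stale heap entry
        for j in range(i + 1, n):
            c = d + transition_cost(states[i], states[j])
            if c < dist[j]:
                dist[j] = c
                prev[j] = i
                heapq.heappush(heap, (c, j))
    # Walk predecessors back from the goal state to recover the keyframes.
    path, i = [], n - 1
    while i is not None:
        path.append(i)
        i = prev[i]
    return list(reversed(path))

# Toy cost model (an assumption for illustration only): every transition has
# a fixed overhead plus a penalty growing quadratically with the jump length,
# so the optimum inserts intermediate keyframes rather than one long skip.
cost = lambda a, b: 1.0 + 0.1 * (b - a) ** 2
print(extract_keyframes(list(range(7)), cost))  # → [0, 3, 6]
```

Because every edge points forward in time, the graph is acyclic and edge weights are positive, so Dijkstra's algorithm recovers the globally cheapest keyframe sequence in O(n² log n) for a demonstration of n states.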
Published in: IEEE Transactions on Cognitive and Developmental Systems ( Volume: 15, Issue: 4, December 2023)
Page(s): 2077 - 2088
Date of Publication: 30 September 2021


I. Introduction

Imitation learning is a powerful paradigm for robots to learn how to perform tasks by imitating an expert's behaviors [1], [2]. Previous studies [3], [4] required experts to provide one or more demonstrations in the form of state-action pairs, from which the robot learns task-related policies. However, acquiring action information usually requires additional sensors [5], [6], action recognition modules [7], and human participation. In some cases, robots cannot access expert actions at all, such as in tutorial videos on YouTube [8]. This reliance on expert actions increases the cost of deploying imitation learning methods on real robots. Thus, it would be very beneficial to devise imitation learning algorithms that do not need expert actions. Zero-shot imitation learning aims to endow robots with the ability to imitate behavior from observation sequences alone. Here, zero-shot means the robot never has access to expert actions, neither during training nor in the task demonstration at inference [9]. Zero-shot imitation is a meaningful setting in robotic applications, as it enables users to teach robots to perform tasks without additional technology or long interaction times. These characteristics make zero-shot imitation a promising way out of the dilemma that deploying robots in small and medium-sized enterprises currently requires cumbersome programming and specialized expertise [10].
