I. Introduction
With the growing interest and development in deep reinforcement learning (DRL), we have seen remarkable progress and achievements of DRL in various application scenarios, including video games [1], autonomous vehicles [2], and robotics control [3]–[5]. Despite this huge success, existing DRL-based methods remain quite limited in generalization [6], which restricts real-world applications of DRL, especially in robot control tasks. The policy depends heavily on the parameter settings of the task, so it can only learn to control a single robot in a single environment at a time. For example, given robot control tasks with different robot hardware implementations (link length, etc.) and environment features (friction coefficient, etc.), we are required to train multiple policies despite the similarity among these tasks, which is computationally expensive and time-consuming. Therefore, it is essential to design a unified DRL method that can be applied across various robots and environments simultaneously.
Figure: Visualization of the generalization task and the GNN-embedding composition. The task pool contains robots with different hardware parameters and environments with various characteristics. Each task is a combination of a robot and an environment. GNN-embedding is a unified policy that controls multiple tasks and is able to learn both task-invariant knowledge and task-specific knowledge.
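For concreteness, the composition of such a task pool can be sketched as the Cartesian product of robot configurations and environment configurations, so that a single unified policy is expected to handle every combination. The parameter names below (link_length, friction) are illustrative assumptions rather than the paper's actual task parameterization.

```python
import itertools
from dataclasses import dataclass

# Hypothetical configuration fields, used only for illustration.
@dataclass(frozen=True)
class RobotConfig:
    link_length: float   # hardware parameter, e.g. limb link length

@dataclass(frozen=True)
class EnvConfig:
    friction: float      # environment feature, e.g. ground friction coefficient

@dataclass(frozen=True)
class Task:
    robot: RobotConfig
    env: EnvConfig

# The task pool is the set of all robot/environment combinations;
# a unified policy should control every task in this pool.
robots = [RobotConfig(link_length=l) for l in (0.3, 0.4, 0.5)]
envs = [EnvConfig(friction=f) for f in (0.5, 1.0, 1.5)]
task_pool = [Task(r, e) for r, e in itertools.product(robots, envs)]

print(len(task_pool))  # 9 tasks from 3 robots x 3 environments
```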