I. Introduction
Robots will eventually become part of daily life in human society [1]. They will no longer perform only the same task thousands of times; rather, they will face thousands of different tasks that rarely repeat in an ever-changing environment [2]. Robot learning will therefore be necessary for end users who lack programming skills. It will allow robots to adapt automatically to stochastic and dynamic environments by learning from demonstration (LfD) or by interacting with the environment [3]. LfD methods are also known as robot programming by demonstration, robot learning from/by demonstration, apprenticeship learning, and imitation learning [4]. They offer several strengths, including nonexpert robot programming, data efficiency, safe learning, performance guarantees, and platform independence, as described in [5]. As many surveys of LfD point out [6]–[8], determining when and where to add a new demonstration, as well as which demonstrations are redundant, remains challenging.