Smooth control of the verge-axis joint of an active vision head is achieved through continuous-state, continuous-action reinforcement learning. The system learns to perform visual servoing from rewards based on its tracking performance. The learned controller compensates for the velocity of the target and pursues a swinging target without lag. By comparing controllers exposed to different environments, we show that the controller predicts the target's motion by forming an implicit model of it. Experimental results demonstrate the advantages and disadvantages of implicit modelling.
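To illustrate why compensating for target velocity matters for lag-free pursuit, the following minimal sketch compares two hand-written controllers on a sinusoidally swinging target. It is not the learned controller described above: the function `simulate`, the gain, the time step, the swing parameters, and the squared-error reward are all hypothetical choices made for illustration, and the velocity term is given the true target velocity rather than a learned prediction.

```python
import math


def simulate(use_velocity_term, gain=5.0, dt=0.01, steps=2000,
             amplitude=0.5, omega=2.0):
    """Track a swinging target with a velocity-commanded joint.

    All parameters are illustrative assumptions, not values from the paper.
    Returns (RMS tracking error in radians, mean reward per step).
    """
    joint = 0.0
    squared_error_sum = 0.0
    reward_sum = 0.0
    for k in range(steps):
        t = k * dt
        # Target angle and angular velocity of an idealised swinging target.
        target = amplitude * math.sin(omega * t)
        target_velocity = amplitude * omega * math.cos(omega * t)

        error = target - joint
        # Assumed reward: closer tracking gives a less negative value.
        reward = -error * error

        # Position-error term alone lags behind a moving target.
        command = gain * error
        if use_velocity_term:
            # Velocity compensation: also move at the target's own velocity.
            command += target_velocity

        # The joint integrates the commanded velocity over one time step.
        joint += command * dt

        squared_error_sum += error * error
        reward_sum += reward
    return math.sqrt(squared_error_sum / steps), reward_sum / steps


rms_position_only, reward_position_only = simulate(use_velocity_term=False)
rms_with_velocity, reward_with_velocity = simulate(use_velocity_term=True)

print(f"position error only:   RMS error = {rms_position_only:.4f} rad")
print(f"with velocity term:    RMS error = {rms_with_velocity:.4f} rad")
```

Under these assumptions, the controller that uses only the position error trails the target throughout the swing, while the one that also moves at the target's velocity tracks it far more closely and collects a higher mean reward. A reward tied to tracking performance therefore favours a controller that anticipates the target's motion.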