Abstract:
Imitation Learning (IL) algorithms such as behavior cloning are a promising direction for learning human-level driving behavior. However, these approaches do not explicitly infer the underlying causal structure of the learned task. This often leads to misattribution of the relative importance of scene elements toward the occurrence of a corresponding action, a phenomenon termed causal confusion or causal misattribution. Causal confusion is made worse in highly complex scenarios such as urban driving, where the agent has access to a large amount of information per time step (visual data, sensor data, odometry, etc.). Our key idea is that while driving, human drivers naturally exhibit an easily obtained, continuous signal that is highly correlated with causal elements of the state space: eye gaze. We collect human driver demonstrations in a CARLA-based VR driving simulator, DReyeVR, allowing us to capture eye gaze in the same simulation environment commonly used in prior work. Further, we propose a contrastive learning method that uses gaze-based supervision to mitigate causal confusion in driving IL agents, exploiting the relative importance of gazed-at and not-gazed-at scene elements for driving decision-making. We present quantitative results demonstrating the promise of gaze-based supervision in improving the driving performance of IL agents.
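The abstract describes contrasting gazed-at against not-gazed-at scene elements as supervision. One way such supervision could look is a margin-based penalty that encourages the agent's attention over gazed-at regions to exceed its attention over non-gazed regions. The sketch below is illustrative only: the function name, the attention-map representation, and the specific hinge-style loss are assumptions for exposition, not the paper's actual formulation.

```python
import numpy as np

def gaze_contrastive_loss(attn_map, gaze_mask, margin=0.1):
    """Hypothetical margin-based contrastive penalty.

    attn_map:  2D array of the agent's attention weights over the scene.
    gaze_mask: boolean array of the same shape; True where the human
               driver's eye gaze landed.
    Returns 0 when mean attention on gazed-at pixels exceeds mean
    attention on non-gazed pixels by at least `margin`; otherwise a
    positive penalty proportional to the shortfall.
    """
    gazed = attn_map[gaze_mask].mean()        # mean attention on gazed-at regions
    not_gazed = attn_map[~gaze_mask].mean()   # mean attention elsewhere
    return max(0.0, margin - (gazed - not_gazed))

# Example: attention already concentrated on gazed-at regions -> no penalty.
attn = np.array([[0.9, 0.1],
                 [0.8, 0.2]])
mask = np.array([[True, False],
                 [True, False]])
print(gaze_contrastive_loss(attn, mask))  # 0.0 (gap 0.7 exceeds margin)

# Uniform attention ignores gaze entirely -> penalty equals the margin.
uniform = np.full((2, 2), 0.5)
print(gaze_contrastive_loss(uniform, mask))  # 0.1
```

In a training pipeline, such a term would typically be added to the behavior-cloning loss with a weighting coefficient, so the policy is simultaneously pushed to imitate actions and to attend where the human looked.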
Published in: 2024 IEEE Intelligent Vehicles Symposium (IV)
Date of Conference: 02-05 June 2024
Date Added to IEEE Xplore: 15 July 2024