Abstract:
Understanding and predicting human behaviors accurately are essential prerequisites for effective human-robot interaction. Recently, there has been growing interest in multi-sensor fusion for building robust and dependable robotic platforms, especially in outdoor settings. However, the majority of current computer vision models focus on a single modality, such as LiDAR point clouds or RGB images, and often capture only one person per scene. This limited approach significantly restricts the effective use of all the data available in robotics. In this study, we propose using multi-sensor fusion to enhance human action detection and motion prediction by incorporating 3D pose and motion information. This approach enables robust human motion tracking and action detection, addressing issues such as inaccurate human localization and matching ambiguity that commonly arise in single-camera RGB videos of outdoor multi-person scenes. Our method achieves high performance on the publicly available Human-M3 dataset, showcasing the potential of multi-sensor, multi-task models in real-world robotics scenarios.
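The abstract itself contains no implementation details, but a common building block of the kind of LiDAR-RGB fusion it describes is projecting point-cloud points into the camera image plane so that 3D measurements can be associated with persons detected in the RGB view. Below is a minimal illustrative sketch of that step, assuming a calibrated pinhole camera with intrinsics `K` and a LiDAR-to-camera extrinsic transform `T_cam_lidar`; all names and values are hypothetical and are not taken from the paper.

```python
import numpy as np

def project_lidar_to_image(points_xyz, T_cam_lidar, K):
    """Project LiDAR points (N, 3) into image pixel coordinates.

    points_xyz  : (N, 3) points in the LiDAR frame (hypothetical input).
    T_cam_lidar : (4, 4) rigid transform from LiDAR frame to camera frame.
    K           : (3, 3) camera intrinsic matrix.
    Returns (M, 2) pixel coordinates for points with positive depth,
    plus the boolean mask selecting those points.
    """
    # Convert to homogeneous coordinates and move into the camera frame.
    pts_h = np.hstack([points_xyz, np.ones((points_xyz.shape[0], 1))])
    pts_cam = (T_cam_lidar @ pts_h.T).T[:, :3]

    # Keep only points in front of the camera (positive depth).
    in_front = pts_cam[:, 2] > 0
    pts_cam = pts_cam[in_front]

    # Pinhole perspective projection, then divide by depth.
    uv = (K @ pts_cam.T).T
    uv = uv[:, :2] / uv[:, 2:3]
    return uv, in_front

if __name__ == "__main__":
    # Toy calibration: identity extrinsics and a simple intrinsic matrix.
    K = np.array([[700.0, 0.0, 320.0],
                  [0.0, 700.0, 240.0],
                  [0.0, 0.0, 1.0]])
    T = np.eye(4)
    pts = np.array([[0.5, 0.2, 5.0], [-1.0, 0.1, 8.0]])
    uv, mask = project_lidar_to_image(pts, T, K)
    print(uv)  # pixel locations where the two LiDAR points land in the image
```

In a multi-person outdoor scene, projected points falling inside a person's 2D bounding box give a direct 3D localization cue, which is one way the matching ambiguity mentioned in the abstract can be reduced.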
Date of Conference: 24-26 April 2024
Date Added to IEEE Xplore: 09 May 2024