Skip to Main Content
A novel stereo vision system for real-time human detection and tracking on a mobile service robot is presented in this paper. The system integrates the individually enhanced stereo-based human detection, HOG-based human detection, color-based tracking, and motion estimation for the robust detection and tracking of humans with large appearance and scale variations in real-world environments. A new framework of maximum likelihood based multi-model fusion is proposed to fuse these four human detection and tracking models according to the detection-track associations in 3D space, which is robust to the possible missed detections, false detections, and duplicated responses from the individual models. Multi-person tracking is implemented in a sequential near-to-far way, which well alleviates the difficulties caused by human-over-human occlusions. Extensive experimental results demonstrate the robustness of the proposed system under real-world scenarios with large variations in lighting conditions, cluttered backgrounds, human clothes and postures, and complex occlusion situations. Significant improvements in human detection and tracking have been achieved. The system has been deployed on six robot butlers to serve drinks, and showed encouraging performance in open ceremony events.