We present a fully automatic algorithm for initializing and tracking the articulated motion of humans using image sequences obtained from multiple cameras. We use a detailed articulated human body model composed of sixteen rigid segments that allows both translation and rotation at the joints. Voxel data of the subject, obtained from the images, are segmented into the different articulated chains using Laplacian eigenmaps. The segmented chains are registered in a subset of the frames using a single-frame registration technique and are subsequently used to initialize the pose in the sequence. A temporal registration method is proposed to identify the partially segmented or unregistered articulated chains in the remaining frames of the sequence. The proposed tracker uses motion cues such as pixel displacement as well as 2-D and 3-D shape cues such as silhouettes, motion residue, and skeleton curves. The tracking algorithm consists of a predictor that uses motion cues and a corrector that uses shape cues. Using complementary cues in tracking alleviates the twin problems of drift and convergence to local minima. Using multiple cameras also allows us to deal with the problems of self-occlusion and kinematic singularity. We present tracking results on sequences with different kinds of motion to illustrate the effectiveness of our approach. The pose of the subject is correctly tracked for the duration of the sequence, as can be verified by inspection.
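As a rough illustration of the segmentation step, the sketch below computes a Laplacian eigenmap embedding of a set of occupied voxels. It is not the authors' implementation: the function name `laplacian_eigenmap`, the 6-connected neighbourhood graph, and the binary edge weights are all assumptions made for this example, and the abstract does not specify how the graph is built or how chains are extracted from the embedding. The sketch only shows the general technique, which is to solve the generalized eigenproblem L y = λ D y on the voxel adjacency graph and keep the eigenvectors with the smallest nonzero eigenvalues.

```python
import numpy as np


def laplacian_eigenmap(voxels, n_dims=3):
    """Embed occupied voxels with Laplacian eigenmaps (illustrative sketch).

    voxels: (N, 3) integer coordinates of occupied voxels. The voxel set is
            assumed to form a single 6-connected component.
    Returns an (N, n_dims) array of embedding coordinates.
    """
    voxels = np.asarray(voxels, dtype=int)
    n = len(voxels)
    index = {tuple(v): i for i, v in enumerate(voxels)}

    # Binary adjacency matrix: connect face-adjacent voxels (assumed graph).
    W = np.zeros((n, n))
    for i, (x, y, z) in enumerate(voxels):
        for dx, dy, dz in ((1, 0, 0), (0, 1, 0), (0, 0, 1)):
            j = index.get((x + dx, y + dy, z + dz))
            if j is not None:
                W[i, j] = W[j, i] = 1.0

    deg = W.sum(axis=1)
    if np.any(deg == 0):
        raise ValueError("isolated voxel: the graph must be connected")

    # Solve L y = lambda D y through the symmetric normalized Laplacian
    # L_sym = I - D^{-1/2} W D^{-1/2}, then map back with y = D^{-1/2} v.
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    L_sym = np.eye(n) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(L_sym)  # eigenvalues in ascending order

    # Skip the first (constant) eigenvector, keep the next n_dims.
    return d_inv_sqrt[:, None] * eigvecs[:, 1:n_dims + 1]
```

On a straight row of voxels the first embedding coordinate varies monotonically along the row, which hints at why elongated body parts could appear as curves in the embedding space. The dense eigendecomposition used here is only practical for small voxel sets; a sparse solver would be the more likely choice at realistic resolutions.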