Automatic perception of human posture and gesture from visual input plays an important role in developing intelligent video systems. In this paper, we present a novel gesture recognition approach for human-computer interaction based on markerless upper body pose tracking in 3-D with multiple cameras. To achieve the robustness and real-time performance required for practical applications, we break the exponentially large search problem of upper body pose into two steps. First, the 3-D movements of the upper body extremities (i.e., head and hands) are tracked. Second, using knowledge of upper body model constraints, these extremity movements are used to infer the whole 3-D upper body motion as an inverse kinematics problem. Since the head and hand regions are typically well defined and undergo less occlusion, tracking them is more reliable and can enable more robust upper body pose determination. Moreover, breaking the problem of upper body pose tracking into two steps reduces the complexity considerably. Using the pose tracking output, gesture recognition is then performed with a longest common subsequence similarity measure over the dynamics of the upper body joint angles. Our experiments provide an extensive validation of the proposed upper body pose tracking from 3-D extremity movements, which showed good results with various subjects in different environments. For gesture recognition based on joint angle dynamics, an experimental evaluation with five subjects performing six upper body gestures yielded average classification accuracies above 90%, indicating the promise and feasibility of the proposed system.
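To make the recognition step concrete, the following is a minimal sketch of a longest common subsequence (LCSS) similarity between two joint-angle sequences, with nearest-template classification on top of it. It is an illustration under stated assumptions, not the implementation used in the paper: the function names, the matching tolerance `eps`, the optional time window `delta`, the normalization by the shorter sequence length, and the nearest-template rule are all hypothetical choices that the abstract does not specify.

```python
from typing import Dict, Optional, Sequence

Frame = Sequence[float]  # one frame: a vector of joint angles (units assumed, e.g. degrees)


def lcss_length(a: Sequence[Frame], b: Sequence[Frame],
                eps: float, delta: Optional[int] = None) -> int:
    """Length of the longest common subsequence of two joint-angle sequences.

    Two frames match when every joint angle differs by at most `eps` and,
    if `delta` is given, their frame indices differ by at most `delta`.
    """
    n, m = len(a), len(b)
    # table[i][j] holds the LCSS length of the prefixes a[:i] and b[:j].
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            close_in_time = delta is None or abs(i - j) <= delta
            close_in_value = all(abs(x - y) <= eps
                                 for x, y in zip(a[i - 1], b[j - 1]))
            if close_in_time and close_in_value:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[n][m]


def lcss_similarity(a: Sequence[Frame], b: Sequence[Frame],
                    eps: float, delta: Optional[int] = None) -> float:
    """LCSS length normalized to [0, 1] by the shorter sequence (an assumed choice)."""
    if not a or not b:
        return 0.0
    return lcss_length(a, b, eps, delta) / min(len(a), len(b))


def classify(query: Sequence[Frame],
             templates: Dict[str, Sequence[Frame]],
             eps: float) -> str:
    """Return the label of the template most similar to the query sequence."""
    return max(templates,
               key=lambda label: lcss_similarity(query, templates[label], eps))


# Toy two-joint sequences, invented purely for illustration.
wave = [(0.0, 10.0), (20.0, 10.0), (40.0, 10.0), (20.0, 10.0)]
push = [(0.0, 90.0), (0.0, 60.0), (0.0, 30.0)]
wave_noisy = [(1.0, 11.0), (19.0, 9.0), (19.5, 9.5), (41.0, 10.0), (21.0, 11.0)]

label = classify(wave_noisy, {"wave": wave, "push": push}, eps=2.0)
```

Because LCSS may skip frames that do not match, a similarity of this kind tolerates differences in gesture speed and occasional noisy frames, which is one plausible reason to prefer it over a frame-by-frame distance for joint-angle dynamics.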