This paper addresses tracking and 3D pose estimation of human faces undergoing large pose and expression changes in video sequences captured by an uncalibrated monocular camera. Classical pose estimation methods suffer from two disadvantages: (1) a 3D head model or a reference frame is always needed, and the camera must be calibrated in advance; (2) they have difficulty handling non-rigid motion, which is very common for human faces. We present a pose estimation system that overcomes both disadvantages. In each frame, a 2D active appearance model is used to reliably track the face and facial features under large pose and expression variations. We then apply a recently developed non-rigid structure from motion (SFM) technique to recover the 3D face shape. Instead of directly using the rotation matrix produced by SFM, we propose a method that uses robust statistics and 3D-2D feature point correspondences to accurately recover the 3D head pose. Our experiments demonstrate the effectiveness and efficiency of the approach.
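To make the final step concrete, the following is a minimal sketch of recovering head pose from 3D-2D feature point correspondences with a robust estimator. The abstract does not specify the camera model or the robust statistic, so everything here is an assumption for illustration: a weak-perspective (scaled orthographic) camera, Huber weights applied through iteratively reweighted least squares, and the hypothetical function name `estimate_pose_weak_perspective`. It should not be read as the authors' implementation.

```python
import numpy as np

def estimate_pose_weak_perspective(X, x, n_iter=10, huber_k=1.345):
    """Illustrative robust pose fit (assumed model, not the paper's method).

    X: (N, 3) 3D feature points, e.g. a shape recovered by non-rigid SFM.
    x: (N, 2) tracked 2D feature points, e.g. from an appearance-model tracker.
    Assumes x_i ~= s * R[:2] @ X_i + t (weak-perspective camera).
    Returns R (3x3 rotation), s (scale), t (2D translation).
    """
    n = X.shape[0]
    Xh = np.hstack([X, np.ones((n, 1))])   # homogeneous 3D points, (N, 4)
    w = np.ones(n)                         # per-point robust weights
    for _ in range(n_iter):
        sw = np.sqrt(w)[:, None]
        # Weighted linear least squares for the 2x4 affine camera (stored as 4x2).
        M, *_ = np.linalg.lstsq(sw * Xh, sw * x, rcond=None)
        r = np.linalg.norm(Xh @ M - x, axis=1)      # reprojection residuals
        sigma = 1.4826 * np.median(r) + 1e-12       # robust scale from the median
        c = huber_k * sigma
        # Huber weights: 1 for small residuals, down-weighted beyond c.
        w = np.where(r <= c, 1.0, c / np.maximum(r, 1e-12))
    A = M[:3].T                            # (2, 3) linear part, ideally s * R[:2]
    t = M[3]
    # Closest matrix with orthonormal rows via SVD; mean singular value as scale.
    U, S, Vt = np.linalg.svd(A, full_matrices=False)
    R2 = U @ Vt
    s = S.mean()
    R = np.vstack([R2, np.cross(R2[0], R2[1])])     # complete to a 3x3 rotation
    return R, s, t
```

Under these assumptions, points with large reprojection error (for example, mistracked features) receive smaller weights, so they influence the recovered rotation less than they would in an ordinary least-squares fit.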