This paper presents a novel method for human action recognition that uses the Affine-SIFT (ASIFT) detector to capture motion. More specifically, we propose a new action representation based on a rich set of descriptors computed from ASIFT key-point trajectories. Most previous approaches to human action recognition focus on action classification or localization and therefore ignore information about human identity; we instead propose using quantized local SIFT descriptors to represent identity. A compact yet discriminative semantic visual vocabulary is built with a latent topic model for high-level representation. Given a novel video sequence, our algorithm can not only categorize the human actions contained in the video but also verify the persons who perform them. We evaluate our algorithm on two datasets, the KTH human motion dataset and our own action dataset; the results demonstrate the promise of our approach.
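The quantization step underlying the visual vocabulary can be illustrated with a minimal sketch: local descriptors are clustered into k visual words, and each video is then summarized as a normalized word histogram. This is a generic bag-of-visual-words sketch using plain Lloyd's k-means over synthetic 128-D vectors, not the paper's actual ASIFT trajectory descriptors or latent topic model; all names, sizes, and data below are illustrative assumptions.

```python
import numpy as np

def build_vocabulary(descriptors, k=8, iters=20, seed=0):
    """Cluster local descriptors into a k-word vocabulary with
    plain Lloyd's k-means (a stand-in for the quantization step)."""
    rng = np.random.default_rng(seed)
    centers = descriptors[rng.choice(len(descriptors), k, replace=False)]
    for _ in range(iters):
        # assign each descriptor to its nearest center
        dists = np.linalg.norm(descriptors[:, None] - centers[None], axis=2)
        labels = dists.argmin(axis=1)
        # recompute centers; keep the old center if a cluster empties
        for j in range(k):
            if np.any(labels == j):
                centers[j] = descriptors[labels == j].mean(axis=0)
    return centers

def quantize(descriptors, centers):
    """Map each descriptor to the index of its nearest visual word."""
    dists = np.linalg.norm(descriptors[:, None] - centers[None], axis=2)
    return dists.argmin(axis=1)

def bow_histogram(words, k):
    """Normalized bag-of-words histogram summarizing one video."""
    h = np.bincount(words, minlength=k).astype(float)
    return h / h.sum()

# synthetic 128-D "SIFT-like" descriptors from two separated clusters
rng = np.random.default_rng(1)
desc = np.vstack([rng.normal(0.0, 0.1, (50, 128)),
                  rng.normal(5.0, 0.1, (50, 128))])
centers = build_vocabulary(desc, k=2)
hist = bow_histogram(quantize(desc, centers), k=2)
```

In a full pipeline, one such histogram per video (or per trajectory set) would serve as the input to the latent topic model that produces the higher-level representation.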