Structure context of local features in realistic human action recognition

6 Author(s)
Qiuxia Wu ; College of Automation Science and Engineering, South China University of Technology, Guangzhou, China ; Shiyang Lu ; Zhiyong Wang ; Feiqi Deng

Realistic human action recognition has emerged as a challenging research topic because of the difficulty of representing different human actions in diverse realistic scenes. In the bag-of-features model, human actions are generally represented by the distribution of local features derived from the keypoints of action videos. Various local features have been proposed to characterize those keypoints. However, the important structural information among the keypoints has not yet been well investigated. In this paper, we propose to characterize such structural information with shape context. Each keypoint is thus characterized by both its local visual attributes and its global structural context contributed by other keypoints. The bag-of-features model is used to represent each human action, and an SVM is employed to perform human action recognition. Experimental results on the challenging YouTube and HOHA-2 datasets demonstrate that our proposed approach, which accounts for structural information, is more effective in representing realistic human actions. In addition, we investigate the impact of choosing different local features, such as SIFT, HOG, and HOF descriptors, in human action representation. We observe that dense keypoints can better exploit the advantages of our proposed approach.
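
To illustrate the idea of describing a keypoint by the layout of the other keypoints, the following is a minimal sketch of a classic 2D shape context descriptor: a log-polar histogram of the relative positions of all other keypoints. It is not the authors' implementation. The function name `shape_context`, the bin counts, the radius limits, and the normalization by mean pairwise distance are all assumptions chosen for illustration, and the abstract does not say whether the paper's formulation is purely spatial or spatio-temporal.

```python
import numpy as np


def shape_context(points, n_r=5, n_theta=12, r_min=0.125, r_max=2.0):
    """Return one log-polar histogram per keypoint (shape: n x (n_r * n_theta)).

    All parameter defaults are illustrative assumptions, not values
    taken from the paper.
    """
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    if n < 2:
        # A single keypoint has no structural context.
        return np.zeros((n, n_r * n_theta))

    off_diag = ~np.eye(n, dtype=bool)

    # diff[i, j] is the vector from keypoint i to keypoint j.
    diff = pts[None, :, :] - pts[:, None, :]
    dist = np.hypot(diff[..., 0], diff[..., 1])

    # Divide by the mean pairwise distance so the descriptor is scale invariant.
    dist = dist / dist[off_diag].mean()
    theta = np.arctan2(diff[..., 1], diff[..., 0]) % (2 * np.pi)

    # Log-spaced radial bin edges and uniform angular bins.
    r_edges = np.logspace(np.log10(r_min), np.log10(r_max), n_r + 1)
    r_bin = np.searchsorted(r_edges, dist, side="right") - 1
    t_bin = np.minimum((theta / (2 * np.pi) * n_theta).astype(int), n_theta - 1)

    # Keypoints outside [r_min, r_max) and the keypoint itself are ignored.
    valid = (r_bin >= 0) & (r_bin < n_r) & off_diag
    rows = np.nonzero(valid)[0]

    hist = np.zeros((n, n_r, n_theta))
    np.add.at(hist, (rows, r_bin[valid], t_bin[valid]), 1)
    return hist.reshape(n, -1)
```

One plausible way to use such a descriptor, again as an assumption rather than the paper's stated procedure, is to concatenate each row with the keypoint's local descriptor (for example SIFT, HOG, or HOF) before quantizing against a codebook for the bag-of-features histogram.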

Published in:

Computer Vision Workshops (ICCV Workshops), 2011 IEEE International Conference on

Date of Conference:

6-13 Nov. 2011