
Human action recognition using labeled Latent Dirichlet Allocation model



Abstract:

Recognition of human actions has long been an active area of computer vision, and action recognition techniques have been applied in many fields such as smart surveillance, motion analysis and virtual reality. In this paper, we propose a new action recognition method that represents human actions as a bag of spatio-temporal words extracted from input video sequences and uses an L-LDA (labeled Latent Dirichlet Allocation) model as the classifier. L-LDA is a supervised extension of the unsupervised LDA model: it adds a label layer on top of LDA that records the category of each training video sequence, so L-LDA can automatically tie each latent topic variable in the model to a specific action category. Moreover, this property allows the model parameters to be estimated more reasonably, accurately and quickly. We test our method on the KTH and Weizmann human action datasets, and the experimental results show that L-LDA outperforms both its unsupervised counterpart LDA and SVMs (support vector machines).
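The label layer described above can be illustrated with a minimal sketch of one collapsed-Gibbs sampling step (the count structures, hyperparameters and label names here are hypothetical, not taken from the paper): plain LDA would let every topic compete for each word, while L-LDA restricts the candidate topics to the labels observed for that training clip, which is what ties each latent topic to one action category.

```python
import random

def sample_topic(word, doc_labels, doc_topic, word_topic, topic_total,
                 alpha=1.0, beta=0.1, vocab_size=100):
    """One collapsed-Gibbs draw for a single spatio-temporal word.
    Plain LDA would score every topic; L-LDA's label layer restricts
    the candidates to the training clip's observed action labels."""
    weights = []
    for t in doc_labels:  # the label layer: only labeled topics are allowed
        p = (doc_topic.get(t, 0) + alpha) * \
            (word_topic.get((t, word), 0) + beta) / \
            (topic_total.get(t, 0) + vocab_size * beta)
        weights.append(p)
    # normalized categorical draw over the allowed topics only
    r = random.random() * sum(weights)
    acc = 0.0
    for t, w in zip(doc_labels, weights):
        acc += w
        if r <= acc:
            return t
    return doc_labels[-1]

# A clip labeled with a single action can only ever assign that topic,
# so its topic is fixed to its category from the start of training.
print(sample_topic(3, ["walking"], {}, {}, {}))  # → walking
```

With multiple labels per clip the draw mixes only those categories, which is why parameter estimation converges faster than in unsupervised LDA, where every topic remains a candidate for every word.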
Date of Conference: 24-26 October 2013
Date Added to IEEE Xplore: 02 December 2013
Electronic ISBN: 978-1-4799-0308-5
Conference Location: Hangzhou, China

I. Introduction

Action recognition is the task of representing and tracking human actions using computer techniques, and then inferring and classifying those actions with the help of other information such as the background and surrounding environment [1]. The key techniques in the field include extracting representative visual features from video sequences, choosing an appropriate feature descriptor, and designing a classification model with good performance [2]. Accordingly, action recognition can be divided into two levels of tasks: (1) feature extraction and representation at the bottom; (2) model learning and action categorization at the top. The flowchart of a general action recognition approach is shown in Fig. 1.
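The bottom level of this pipeline can be sketched in a few lines: local descriptors extracted around spatio-temporal interest points are quantized against a learned codebook, and their histogram is the bag-of-words representation passed to the classifier at the top level. The 2-D descriptors and codebook below are toy stand-ins (real descriptors would be gradient or flow histograms), so this is an illustrative sketch rather than the paper's implementation.

```python
from collections import Counter

def quantize(descriptors, codebook):
    """Map each local descriptor to the index of its nearest codeword
    (squared Euclidean distance)."""
    words = []
    for d in descriptors:
        dists = [sum((a - b) ** 2 for a, b in zip(d, c)) for c in codebook]
        words.append(dists.index(min(dists)))
    return words

def bag_of_words(words, vocab_size):
    """Codeword-count histogram: the 'bag of spatio-temporal words'."""
    counts = Counter(words)
    return [counts.get(w, 0) for w in range(vocab_size)]

# Toy example: a 2-word codebook and three hypothetical descriptors.
codebook = [(0.0, 0.0), (1.0, 1.0)]                 # learned offline, e.g. by k-means
descriptors = [(0.1, 0.2), (0.9, 1.1), (1.0, 0.8)]  # one video's local features
hist = bag_of_words(quantize(descriptors, codebook), len(codebook))
print(hist)  # → [1, 2]  (this histogram is the classifier's input)
```

The top level then treats each histogram as a document of word counts, which is exactly the input format topic models such as LDA and L-LDA expect.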

References
1. S. Vishwakarma and A. Agrawal, Representing feature quantization approach using spatial-temporal relation for action recognition, Lecture Notes in Computer Science, vol. 7143, pp. 98-105, Apr. 2012.
2. D. Weinland, R. Ronfard, and E. Boyer, A survey of vision-based methods for action representation, segmentation and recognition, Computer Vision and Image Understanding, vol. 115, pp. 224-241, Feb. 2011.
3. H. Ragheb, S. Velastin, P. Remagnino, and T. Ellis, Human action recognition using robust power spectrum features, in IEEE International Conference on Image Processing, pp. 753-756, Oct. 2008.
4. S. Danafar and N. Gheissari, Action recognition for surveillance applications using optic flow and SVM, in Asian Conference on Computer Vision, pp. 457-466, Nov. 2007.
5. H. Wang, A. Kläser, C. Schmid, and C.-L. Liu, Action recognition by dense trajectories, in IEEE Conference on Computer Vision and Pattern Recognition, pp. 3169-3176, 2011.
6. I. Laptev, M. Marszałek, C. Schmid, and B. Rozenfeld, Learning realistic human actions from movies, in IEEE Conference on Computer Vision and Pattern Recognition, 2008.
7. P. Dollár, V. Rabaud, G. Cottrell, and S. Belongie, Behavior recognition via sparse spatio-temporal features, in IEEE Workshop on Visual Surveillance and Performance Evaluation of Tracking and Surveillance, pp. 65-72, 2005.
8. G. Willems, T. Tuytelaars, and L. Van Gool, An efficient dense and scale-invariant spatio-temporal interest point detector, in European Conference on Computer Vision, pp. 650-663, 2008.
9. T. Hofmann, Probabilistic latent semantic indexing, in ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 50-57, 1999.
10. D. M. Blei, A. Y. Ng, and M. I. Jordan, Latent Dirichlet allocation, Journal of Machine Learning Research, vol. 3, pp. 993-1022, 2003.
11. S. Savarese, A. DelPozo, J. C. Niebles, and L. Fei-Fei, Spatial-temporal correlatons for unsupervised action classification, in IEEE Workshop on Motion and Video Computing, pp. 1-2, 2008.
12. J. C. Niebles, H. Wang, and L. Fei-Fei, Unsupervised learning of human action categories using spatial-temporal words, International Journal of Computer Vision, vol. 79, no. 3, pp. 299-318, 2008.
13. P. Scovanner, S. Ali, and M. Shah, A 3-dimensional SIFT descriptor and its application to action recognition, in ACM Multimedia, pp. 357-362, 2007.
14. X. Jiang, T. Sun, B. Feng, and C. Jiang, A space-time SURF descriptor and its application to action recognition with video words, in International Conference on Fuzzy Systems and Knowledge Discovery, pp. 1911-1915, 2011.
15. C. Schuldt, I. Laptev, and B. Caputo, Recognizing human actions: a local SVM approach, in International Conference on Pattern Recognition, vol. 3, pp. 32-36, 2004.
