I. Introduction
Action recognition is to represent and track human actions using computer techniques, and then infer and category actions combined with other information such as background and surrounding environment [1]. The key techniques in the field of action recognition include extracting representative visual features from video sequences, choosing appropriate feature descriptor and designing classification model with a good performance [2]. According to the above analysis, action recognition can be divided into two level tasks: (1) feature extraction and representation at the bottom; (2) model learning and action categorization at the top. We can see the flowchart of general action recognition approach in Fig.1..