A structured discriminative random fields model is proposed for human action recognition. To represent a human action compactly yet distinctively, the motion-constrained SIFT (MoSIFT) algorithm is used to extract and describe salient regions, and a Bag of Words representation is then applied to convert the action sequence into a feature sequence. On this feature representation, a structured discriminative random fields model is constructed for action modelling and classification. The contribution of the work is to explicitly learn the transitions in visual pattern between elementary actions, and thereby capture the nature of the entire action, rather than modelling the gradual change of visual pattern between adjacent frames as traditional methods do. A large-scale experiment demonstrated the accuracy and robustness of the method, which also outperforms representative state-of-the-art methods for human action recognition.
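The Bag of Words step that turns an action sequence into a feature sequence can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that local descriptors (for example from MoSIFT) have already been extracted and grouped per temporal segment, and it uses a plain k-means codebook, which is one common choice rather than a detail confirmed by the abstract. The function names, the codebook size and the descriptor dimension are all hypothetical.

```python
import numpy as np


def build_codebook(descriptors, k, n_iter=20, seed=0):
    """Cluster descriptors (n, d) into k visual words with plain k-means.

    Assumption: k-means is used here only for illustration.
    """
    rng = np.random.default_rng(seed)
    # Initialise the centres from k distinct descriptors.
    idx = rng.choice(len(descriptors), size=k, replace=False)
    centers = descriptors[idx].astype(float)
    for _ in range(n_iter):
        # Squared Euclidean distance from every descriptor to every centre.
        d2 = ((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            members = descriptors[labels == j]
            if len(members) > 0:  # keep the old centre if a cluster is empty
                centers[j] = members.mean(axis=0)
    return centers


def bow_histogram(descriptors, codebook):
    """Return the normalised visual-word histogram of one segment."""
    k = len(codebook)
    if len(descriptors) == 0:
        return np.zeros(k)  # no salient regions detected in this segment
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    words = d2.argmin(axis=1)  # nearest visual word per descriptor
    hist = np.bincount(words, minlength=k).astype(float)
    return hist / hist.sum()


def sequence_to_features(segments, codebook):
    """Convert a list of per-segment descriptor arrays into a (T, k) feature sequence."""
    return np.stack([bow_histogram(seg, codebook) for seg in segments])
```

Under these assumptions, each row of the array returned by `sequence_to_features` is one observation in the feature sequence that a sequence model such as the random fields model would take as input.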