Hierarchical Filtered Motion for Action Recognition in Crowded Videos

4 Author(s)
YingLi Tian (IBM T.J. Watson Res. Center, Yorktown Heights, NY, USA); Liangliang Cao; Zicheng Liu; Zhengyou Zhang

Action recognition in videos with cluttered and moving backgrounds is a challenging problem. One main difficulty is that the motion field in an action region is contaminated by background motion. We propose a hierarchical filtered motion (HFM) method to recognize actions in crowded videos, using the motion history image (MHI) as the basic representation of motion because of its robustness and efficiency. First, we detect interest points as two-dimensional Harris corners with recent motion, e.g., locations with high intensities in the MHI. Then, a global spatial motion smoothing filter is applied to the gradients of the MHI to eliminate isolated, unreliable, or noisy motions. At each interest point, a local motion field filter is applied to the smoothed gradients of the MHI by computing the structure proximity between each pixel in the local region and the interest point. Thus, the motion at a pixel is enhanced or weakened based on its structure proximity to the interest point. To validate the method's effectiveness, we characterize the spatial and temporal features by histograms of oriented gradients in the intensity image and the MHI, respectively, and use a Gaussian-mixture-model-based classifier for action recognition. The proposed approach achieves state-of-the-art results on the KTH dataset, which has a clean background. More importantly, we perform cross-dataset action classification and detection experiments, in which the KTH dataset is used for training and the Microsoft Research (MSR) Action Dataset II, which consists of crowded videos with people moving in the background, is used for testing. Our experiments show that the proposed HFM method significantly outperforms existing techniques.
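
As a rough illustration of the motion history image that the abstract builds on, the sketch below applies the commonly used MHI update rule: a pixel that changed between consecutive frames is set to a maximum value, and every other pixel decays by one per frame. It is not the authors' implementation. The function names `update_mhi` and `make_frame`, the parameters `tau` and `diff_threshold`, and their values are assumptions chosen for the example; the abstract does not specify them. Nested lists are used only to keep the example dependency-free.

```python
def update_mhi(mhi, prev_frame, frame, tau=10, diff_threshold=25):
    """One MHI update step (assumed standard rule, not taken from the paper).

    Pixels whose intensity changed by more than diff_threshold are set to tau;
    all other pixels decay by 1, with a floor of 0.
    """
    new_mhi = []
    for mhi_row, prev_row, cur_row in zip(mhi, prev_frame, frame):
        new_row = []
        for history, prev_px, cur_px in zip(mhi_row, prev_row, cur_row):
            if abs(cur_px - prev_px) > diff_threshold:
                new_row.append(tau)                    # recent motion
            else:
                new_row.append(max(history - 1, 0))    # older motion fades
        new_mhi.append(new_row)
    return new_mhi


def make_frame(t, height=6, width=8):
    """Hypothetical test frame: a bright 2x2 block whose left edge is at column t."""
    frame = [[0] * width for _ in range(height)]
    for y in (2, 3):
        for x in (t, t + 1):
            frame[y][x] = 255
    return frame


# The block moves one pixel to the right per frame over four frames.
frames = [make_frame(t) for t in range(4)]
mhi = [[0] * 8 for _ in range(6)]
for prev, cur in zip(frames, frames[1:]):
    mhi = update_mhi(mhi, prev, cur)

print(mhi[2])  # [8, 9, 10, 9, 10, 0, 0, 0]
```

Higher values mark more recent change, so the gradient of the MHI points roughly along the direction of motion. Note that frame differencing only fires at the leading and trailing edges of the uniform block, which is why column 3 holds 9 rather than 10.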

Published in:

IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews (Volume 42, Issue 3)