We propose a new method for human action recognition from video streams that is fast and robust to noise and to large changes in camera view. Given the bounding boxes that contain the human silhouette over a number of video frames representing a basic action, we extract features in the Fourier domain. After preprocessing, we divide each space-time volume into space-time sub-volumes (STSVs) and compute their mean-power spectra, which serve as our feature vectors. These features yield high classification performance even with simple distance measures. We compare our method experimentally with two state-of-the-art methods on the same data.
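As a rough illustration of the feature extraction described above, the following Python sketch splits a space-time volume into sub-volumes and summarizes each by its spectral power. It is not the authors' implementation. The function names `stsv_features` and `nearest_label`, the 2x2x2 grid, the use of a 3-D FFT, the reduction of each sub-volume's power spectrum to a single mean value, and the Euclidean nearest-neighbour classifier are all assumptions made for illustration; the abstract does not specify the preprocessing, the partitioning, or the distance measure.

```python
import numpy as np


def stsv_features(volume, grid=(2, 2, 2)):
    """Return one mean spectral power value per space-time sub-volume.

    `volume` is a (T, H, W) array of silhouette frames (assumed already
    cropped to the bounding box and preprocessed). `grid` is a hypothetical
    choice of how many blocks to cut along time, height, and width.
    """
    volume = np.asarray(volume, dtype=float)
    feats = []
    for t_block in np.array_split(volume, grid[0], axis=0):
        for h_block in np.array_split(t_block, grid[1], axis=1):
            for sub in np.array_split(h_block, grid[2], axis=2):
                # Power spectrum of the sub-volume: squared FFT magnitude.
                power = np.abs(np.fft.fftn(sub)) ** 2
                # Assumption: summarize the spectrum by its mean power.
                feats.append(power.mean())
    return np.array(feats)


def nearest_label(query, train_feats, train_labels):
    """Classify by Euclidean nearest neighbour (one possible simple distance)."""
    dists = np.linalg.norm(np.asarray(train_feats) - np.asarray(query), axis=1)
    return train_labels[int(np.argmin(dists))]
```

With these assumptions, a clip is turned into a feature vector by `stsv_features(clip)`, and an unseen clip is labeled by passing its vector to `nearest_label` together with the vectors and labels of the training clips. Keeping more of each sub-volume's spectrum than its mean would be an equally plausible reading of "mean-power spectra".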
Published in: Image Processing, 2008. ICIP 2008. 15th IEEE International Conference on
Date of Conference: 12-15 Oct. 2008