This paper presents a novel approach for combining optical flow into enhanced 3D motion vector fields for human action recognition. Our approach detects the motion of the actors by computing optical flow in video data captured by a multi-view camera setup with an arbitrary number of views. Optical flow is estimated in each view and extended to 3D using 3D reconstructions of the actors and pixel-to-vertex correspondences. The resulting per-view 3D optical flow is combined into a 3D motion vector field by taking the significance of local motion and its reliability into account. 3D Motion Context (3D-MC) and Harmonic Motion Context (HMC) are used to represent the extracted 3D motion vector fields efficiently and in a view-invariant manner, while accounting for differences in the anthropometry of the actors and variations in their movement style. The resulting 3D-MC and HMC descriptors are classified into a set of human actions using normalized correlation, taking into account variations in the speed at which different actors perform an action. We compare the performance of the 3D-MC and HMC descriptors and show promising experimental results on the publicly available i3DPost Multi View Human Action Dataset.
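
As an illustration of the final classification step, the following minimal sketch assigns a descriptor to the action whose template has the highest normalized correlation with it. It rests on several assumptions that are not taken from the paper: descriptors are flattened into plain lists of floats, the correlation is the zero-mean (Pearson-style) variant, each action is represented by a single template, and the function names `normalized_correlation` and `classify` are hypothetical. Handling of execution-speed variations is not covered here.

```python
import math


def normalized_correlation(a, b):
    """Zero-mean normalized correlation of two equal-length descriptors.

    Assumption: the zero-mean variant is used; the result lies in [-1, 1].
    """
    if len(a) != len(b) or not a:
        raise ValueError("descriptors must be non-empty and of equal length")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    norm = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if norm == 0.0:
        # A constant descriptor carries no correlation information.
        return 0.0
    return sum(x * y for x, y in zip(da, db)) / norm


def classify(descriptor, templates):
    """Return the label of the template most correlated with the descriptor.

    `templates` is a hypothetical mapping from action label to one
    reference descriptor of the same length as `descriptor`.
    """
    return max(
        templates,
        key=lambda label: normalized_correlation(descriptor, templates[label]),
    )


# Toy example with made-up four-bin descriptors (not data from the paper).
templates = {
    "walk": [0.0, 1.0, 0.0, 1.0],
    "jump": [1.0, 0.0, 1.0, 0.0],
}
predicted = classify([0.1, 0.9, 0.2, 0.8], templates)
```

Because the correlation is normalized, the comparison is insensitive to a global scaling of the descriptor values, so only the relative distribution of motion across bins affects the predicted label.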