Recognizing action at a distance
Efros, A.A.
Berg, A.C.
Mori, G.
Malik, J.
Comput. Sci. Div., California Univ., Berkeley, CA, USA
This paper appears in: Computer Vision, 2003. Proceedings. Ninth IEEE International Conference on
Publication Date: 13-16 Oct. 2003
On page(s): 726-733 vol.2
Location: Nice, France
ISBN: 0-7695-1950-4
INSPEC Accession Number: 7970983
Digital Object Identifier: 10.1109/ICCV.2003.1238420
Current Version Published: 2008-04-03
Abstract
Our goal is to recognize human action at a distance, at resolutions where a whole person may be, say, 30 pixels tall. We introduce a novel motion descriptor based on optical flow measurements in a spatiotemporal volume for each stabilized human figure, and an associated similarity measure to be used in a nearest-neighbor framework. Making use of noisy optical flow measurements is the key challenge. We address it by treating optical flow not as precise pixel displacements, but rather as a spatial pattern of noisy measurements that are carefully smoothed and aggregated to form our spatiotemporal motion descriptor. To classify the action being performed by a human figure in a query sequence, we retrieve the nearest neighbor(s) from a database of stored, annotated video sequences. We can also use these retrieved exemplars to transfer 2D/3D skeletons onto the figures in the query sequence, and to drive two forms of data-based action synthesis: "do as I do" and "do as I say". Results are demonstrated on ballet, tennis, and football datasets.
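The pipeline outlined in the abstract (smooth noisy flow, aggregate it over a short temporal window, then retrieve the nearest annotated exemplar) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the abstract: the split of flow into four non-negative channels, the 3x3 box blur standing in for the smoothing step, the cosine-style similarity, and all function names (`half_wave_channels`, `box_blur`, `describe`, `similarity`, `classify`). Flow fields are assumed to be precomputed on a stabilized figure window and given as rows of `(fx, fy)` pairs.

```python
from math import sqrt


def half_wave_channels(flow):
    """Split a flow field (rows of (fx, fy) pairs) into four non-negative
    channels: rightward, leftward, downward, upward motion."""
    channels = [[], [], [], []]
    for row in flow:
        channels[0].append([max(fx, 0.0) for fx, _ in row])
        channels[1].append([max(-fx, 0.0) for fx, _ in row])
        channels[2].append([max(fy, 0.0) for _, fy in row])
        channels[3].append([max(-fy, 0.0) for _, fy in row])
    return channels


def box_blur(channel):
    """3x3 mean filter with edge clamping; an assumed stand-in for the
    smoothing step, chosen only to keep the sketch dependency-free."""
    h, w = len(channel), len(channel[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    total += channel[yy][xx]
            out[y][x] = total / 9.0
    return out


def describe(flow_frames):
    """Aggregate the blurred channels of every frame in a temporal window
    into one flat descriptor vector."""
    vector = []
    for flow in flow_frames:
        for channel in half_wave_channels(flow):
            for row in box_blur(channel):
                vector.extend(row)
    return vector


def similarity(a, b):
    """Normalized dot product of two descriptors (0.0 if either is all zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(query, database):
    """Return the label of the most similar stored descriptor.
    `database` is a list of (label, descriptor) pairs."""
    return max(database, key=lambda item: similarity(query, item[1]))[0]


def uniform_flow(fx, fy, size=4):
    """Synthetic flow field in which every pixel moves by (fx, fy)."""
    return [[(fx, fy)] * size for _ in range(size)]


# Toy database of two annotated three-frame windows, plus a noisy query.
database = [
    ("moving right", describe([uniform_flow(1.0, 0.0)] * 3)),
    ("moving up", describe([uniform_flow(0.0, -1.0)] * 3)),
]
query = describe([uniform_flow(0.8, 0.1)] * 3)
label = classify(query, database)
```

In this toy setup the query, whose flow is mostly rightward with a small vertical component, is matched to the "moving right" exemplar. Real flow fields would come from an optical flow estimator applied to tracked, stabilized figure windows, which this sketch does not attempt.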