Reliable wide-field detection of human activity is an unsolved problem. The main difficulty is that low resolution and the unconstrained nature of realistic environments and human behaviour make form cues unreliable. Here we argue that reliability in far- or wide-field detection can still be achieved by probabilistic combination of multiple weak but complementary visual cues that do not depend on detailed form analysis. To demonstrate, we describe a real-time Bayesian algorithm for localizing human activity in relatively unconstrained scenes, using motion, background subtraction and skin colour cues. Fast sampling of scale space is achieved using integral images and a flexible norm that can handle sparse cues without loss of statistical power. We show that the probabilistic approach far outperforms a representative logical approach in which skin and background subtraction classifiers are combined conjunctively. Our method is currently used in a pre-attentive human activity sensor, generating saccadic targets for an attentive foveated vision system that reliably fixates faces over a 130 deg field of view, allowing high-resolution capture of facial images over a large dynamic scene.
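The two computational ideas named in the abstract, probabilistic cue combination and fast box sums via integral images, can be illustrated with a minimal sketch. This is not the authors' implementation: the conditional independence of cues, the binary cue maps, the cue probabilities, and all function names (`integral_image`, `box_sum`, `log_likelihood_ratio`, `evidence_map`) are assumptions made for illustration, and the "flexible norm" mentioned above is not reproduced.

```python
import math


def integral_image(grid):
    """Return an (H+1) x (W+1) summed-area table with a zero first row and column."""
    h, w = len(grid), len(grid[0])
    sat = [[0.0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0.0
        for x in range(w):
            row_sum += grid[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat


def box_sum(sat, top, left, bottom, right):
    """Sum over rows top..bottom-1 and columns left..right-1 in four lookups."""
    return sat[bottom][right] - sat[top][right] - sat[bottom][left] + sat[top][left]


def log_likelihood_ratio(p_on_human, p_on_background, cue_on):
    """Log-likelihood ratio contributed by one binary cue at one pixel."""
    if cue_on:
        return math.log(p_on_human / p_on_background)
    return math.log((1.0 - p_on_human) / (1.0 - p_on_background))


def evidence_map(cue_maps, cue_models):
    """Sum per-cue log-likelihood ratios at each pixel.

    Assumes the cues are conditionally independent given the class
    (an illustrative simplification, not a claim about the paper).
    """
    h, w = len(cue_maps[0]), len(cue_maps[0][0])
    return [
        [
            sum(
                log_likelihood_ratio(p_h, p_b, cue[y][x])
                for cue, (p_h, p_b) in zip(cue_maps, cue_models)
            )
            for x in range(w)
        ]
        for y in range(h)
    ]


# Hypothetical binary cue maps (1 = cue fired) for skin and background subtraction.
skin = [[0, 1], [0, 1]]
foreground = [[0, 1], [1, 0]]
# Hypothetical (P(cue on | human), P(cue on | background)) pairs.
models = [(0.8, 0.2), (0.8, 0.2)]

evidence = evidence_map([skin, foreground], models)
sat = integral_image(evidence)
# Total evidence for the right-hand column, obtained in constant time.
right_column_score = box_sum(sat, 0, 1, 2, 2)
```

Because box sums cost four lookups regardless of window size, candidate windows at many scales can be scored cheaply. A conjunctive rule would score a pixel as positive only when every cue fires, whereas the summed log ratios retain partial evidence from any single cue.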
Computer Vision and Pattern Recognition, 2005. CVPR 2005. IEEE Computer Society Conference on (Volume: 2)
Date of Conference: 20-25 June 2005