Skip to Main Content
Automated video surveillance applications require accurate separation of foreground and background image content. Cost sensitive embedded platforms place realtime performance and efficiency demands on techniques to accomplish this task. In this paper we evaluate pixel-level foreground extraction techniques for a low cost integrated surveillance system. We introduce a new adaptive technique, multimodal mean (MM), which balances accuracy, performance, and efficiency to meet embedded system requirements. Our evaluation compares several pixel-level foreground extraction techniques in terms of their computation and storage requirements, and functional accuracy for three representative video sequences. The proposed MM algorithm delivers comparable accuracy of the best alternative (Mixture of Gaussians) with a 6X improvement in execution time and an 18% reduction in required storage.